Indic Meet 2009, Pune

May 21, 2009 at 5:47 am (Uncategorized)

I had the privilege of being invited to the Indic Meet 2009 held at the Red Hat office in Pune. It was organised by IndLinux. Here is the event details page and schedule. Here I shall present a personal view of the experience.

My flight landed an hour late at Pune airport at 12 midnight. It was my first flight and I thoroughly enjoyed it! I found Ramkrsna waiting for me outside and he took me to my hotel on his bike. Ravikant from Sarai was my roommate.

Next morning I met Gora and Shantanu for the first time. They had come to our room to say hi. We then made our way to the Red Hat office at Kalyani Nagar. I was also looking forward to meeting Shreyank, who had recently joined them. We are college friends.

At the venue I met many new people: Runa, Ravishankar Ji, Parag Nemade, Pravin Satpute, Rahul Bhalerao, Santhosh, Amitakhya and G. Karunakar. As soon as I reached the room a discussion on OCR ensued. Soon Mr. Sankarshan came in and the meet was underway. The rest of the proceedings were divided into 2 tracks, Localisation and Development. We moved to another room to discuss development.

The first session was on “Locales, and related work” by Parag and Rahul. There was an involved discussion on the various problems in the glibc collation and sorting rules. Santhosh shared his knowledge. I sat all the while learning stuff. Gora, sitting beside me, was frantically typing up all the points being made into the IndLinux wiki. Mr. Sankarshan would drop in from time to time to check whether we were working or gossiping 🙂

We heard that the localisation track was ahead of time and we moved to the localisation room after we were done. Soon after was lunch. It was decided that OCR be discussed with everyone since the localisation track was ahead of schedule.

Next up was my presentation on Indic OCR. Here is the slide. There were a few questions which I was happy to answer. Next up were Gora and Shantanu presenting their own approach to the OCR problem. Ravishankar Ji gave valuable input by demonstrating Sanskrit OCR later.

We then moved to another room and sat with a few language teams. People asked me to demonstrate how a new language is trained for the OCR. I decided to make training files for Tamil. We went to Wikipedia, copied all the characters of the Tamil alphabet and fed them into the trainer. We then took a screenshot of some Tamil text from Tamil Wikipedia and ran the OCR on it. It gave reasonably good results, with an accuracy of around 85%, though there was an issue with some of the vowel signs being messed up. I was then requested to provide the instructions and make the process easier. I have since made some changes to the code and committed them. I am yet to modify the README and blog about it in detail.
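For anyone curious how an accuracy figure like that can be put on a firmer footing, here is a minimal sketch in Python. It is not the code we used at the meet. It assumes you have a hand-typed reference text for the scanned image and compares it with the OCR output by edit distance, counted per Unicode code point, so a dropped vowel sign costs one error. The function names `edit_distance` and `char_accuracy` are mine, made up for this illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, counted per code point."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_accuracy(truth: str, ocr_output: str) -> float:
    """Fraction of the reference text recovered, clamped to [0, 1]."""
    if not truth:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(truth, ocr_output)
    return max(0.0, 1.0 - errors / len(truth))


if __name__ == "__main__":
    # Hypothetical example: the reference word has five code points and
    # the OCR output has lost the vowel sign U+0BBF.
    truth = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"
    ocr_output = "\u0ba4\u0bae\u0bb4\u0bcd"
    print(char_accuracy(truth, ocr_output))
```

Counting per code point is only one possible choice. For Indic scripts it may be fairer to count per syllable cluster, since a single wrong vowel sign spoils the whole cluster for the reader.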

As the day came to a close, I went to Shreyank’s guest house and had some fun. We met Pradeepto who had come to meet us. I was very happy to see him. Later that evening we went to a restaurant and had sizzlers. Later that night I spoke to Ravikant at length about Sarai and his interests there.

The next day we started with Gujarati localisation. Pradeepto showed up at the office to say hi. I was thrilled to see him wearing our NIT DGP LUG T-shirt! We discussed some Python bits in the cafeteria. When I came back Runa was giving her talk on translation quality assurance, which I found interesting. Ravikant gave his inputs on a translation sprint he had recently organised in Delhi.

Next up was Gora talking about dictionaries. He showed us in detail how to create a dictionary from scratch. Karunakar gave valuable inputs.

Post lunch there was a talk on online handwriting recognition by Rahul which captured everyone's attention. Santhosh gave several talks, since he works on so many projects. I particularly liked the N-gram model word level suggestion engine he is trying to build. We discussed a few ideas. I also enjoyed listening to G. Karunakar whenever he spoke. He is so damn cool!
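To give a flavour of what an N-gram based word suggestion can look like, here is a toy sketch in Python. This is not Santhosh's engine and I do not know how his is built. It assumes the simplest case, a bigram model that counts which word follows which in a small corpus and suggests the most frequent followers. The names `train_bigrams` and `suggest` are made up for this illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it in the corpus."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev_word, next_word in zip(words, words[1:]):
            model[prev_word][next_word] += 1
    return model


def suggest(model, prev_word, k=3):
    """Return up to k of the most frequent followers of prev_word."""
    followers = model.get(prev_word, Counter())
    return [word for word, _ in followers.most_common(k)]


if __name__ == "__main__":
    # Tiny made-up corpus; a real engine would train on a large text
    # collection in the target language.
    corpus = ["the cat sat", "the cat ran", "the dog sat"]
    model = train_bigrams(corpus)
    print(suggest(model, "the", k=2))
```

Splitting on whitespace is a big simplification. A real engine for Indic languages would need proper tokenisation and some smoothing for words it has never seen.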

I would also like to mention Sandeep Shedmake, whose views on applying TQM to localisation seemed far-fetched at the time, but were an excellent vision nevertheless. I had some discussion on that on IRC later.

The day was drawing to a close and we found ourselves discussing what IndLinux meant. Venkatesh Hariharan came in and joined the discussion. We decided to change a lot of the wiki and web pages to make them more productive and lucid. There was also a lot of talk about setting up an IndLinux society. The day ended with people saying goodbyes and “see you at” 🙂

All in all the experience was thrilling for a student like me. I was probably the least knowledgeable guy there and all I did was learn more. I will always rue not having a camera with me though. I am buying a camera first thing with my first salary 🙂


  1. atul Verma said,

    Hmm nice post. Would like to know more about Indic . wht is all about . Who people and an individual can contribute.

  2. रवि said,

    Great, picture by picture account. Every incident recalled and recollected. Nice meeting you there. Some day we will together have MACHER JHOL and BHAAT. 🙂

  3. Suresh said,

    I couldn’t attend as I was at Vaishno Devi with family at the time.

    Nice to read how the meeting proceeded. Sounds like it was fun, enthusiasm and comraderie.

    Hope to join in future meetings 🙂

  4. Shreyank said,

    Even I am considering buying a Camera soon….

  5. kkn said,

    pdf format is not free. this gets me excited. if only i knew a bit more programming!:(

  6. Rahul Sundaram said,

    Is there any effort going into packaging the OCR program for Fedora? If you are interested, drop me a mail.
