Sunday 14 December 2014

How to Start Competitive Programming

If you've ever thought about competitive coding, all you've got to do is start now. Doesn't matter which class or branch you are in. Everyone can code and everyone can be competitive. Basically, all you've got to have is the right mindset and the right skill set.

Let's talk about the mindset first. Competitive programming requires a drastically different mindset when compared to "normal" (or developmental) programming. While in developmental programming you have to concentrate on writing quality code that is manageable, competitive programming generally involves writing code that "just works". This is because you'll probably never even look at your code again, once you get an AC (accepted) result from the grader. This does not mean that you should write sloppy code since that would just make it hard to debug (and this can be a real pain in the a** in a competitive setting, with the added pressure of time running out). All it means is that the code need not have extensive documentation (just some inter-spread comments to help out when debugging are enough) or modularity/objects/classes (just write it in logically separate functions to make life easier).

What about the skill set? Basically, the only requirement is that you should be able to write in the language that the grader accepts without having to look at the documentation for the basics. Since almost all graders support C++ (I haven't found any that don't, except in some "special language contests" etc), and C++ has a massively powerful library called STL (standard template library), C++ is the natural choice for most competitive coders. However, if Java is your mother tongue, a fairly large number of graders accept this as well, but do note that Java is known to be slightly slower and more cumbersome to write in a pressured setting. Java does have one major advantage over C++ and this lies in its BigInteger library. However, this is so rarely used that it isn't worth the hassle of doing everything in Java just for this. Most coders use C++ for everything except for when the question requires large integers (larger than 10^18 or so).

I've put together a small set of FAQs below and this should help. If you do have any more questions, do ask in the comments below.

Q: Which language should I use?
A: C++ (Read above for why)

Q: How should I learn C++?
A: The tutorials on cplusplus.com are extremely useful (http://www.cplusplus.com/doc/tutorial/). Not to mention, their documentation is in depth and is useful later on also, when learning things like STL.

Q: Any specifics of the language that are more useful?
A: C++ is a very powerful language with a gazillion features (and more added with C++11), but a whole large chunk of them are not necessary whatsoever for competitive programming. Basically, you should be comfortable with variable types, I/O (using cin, cout - there are faster ways but those are usually not necessary unless a 0.001 second improvement is the only thing that's causing a TLE - usually, the algorithm can be improved instead), control statements (if, ?: operator), looping (for and while, break and continue), functions (call by reference and call by value), arrays, strings (as character arrays), struct. Once you're comfortable with these, learn how to use basics of STL (vector, sort() are most useful). Pointers and dynamic memory is usually a larger headache than anything in the pressures of a contest. See a further question for more. Also, classes and objects will usually not be used since your code usually depends heavily on algorithms and does not usually break down into a object-oriented way.

Q: What do I do for unknown sizes of memory? Is int x[n]; OK? Can I use pointers for this (using int *x = new int[n];)?
A: NO!!! int x[n]; is NOT OK. Arrays need to have a compile time constant size. n is not a constant at compile time. Pointers and dynamic memory is too much of a headache. Just declare an array that is of the size that is max allowed (based on question). If question says n is less than 10^6, then just declare int x[1000010];. The extra 10 is just to be safe :P. Oh, and put large arrays in global scope (declare outside the function) so that it doesn't cause stack overflow problems (can cause SIGSEGV on graders). And don't forget to initialize them using loops whenever necessary. If you really need a truly dynamic sized array, don't use pointers, use std::vector (http://www.cplusplus.com/reference/vector/vector/).

Q: Where do I learn algorithms from?
A: Best place I've found is the MIT OpenCoursware. It is a BTech level course (but usually understandable by most Indian school students too). Link: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/

Q: What if I prefer books instead of videos?
A: You might've heard of the book we fondly call "murder-weapon" due to it's size. CLRS (Introduction to Algorithms by Cormen, Leiserson, Rivest, Stein) is like the bible of algorithms. If you have the guts to read this huge book, by all means, please do. However, if you want something smaller and more manageable, try the book by Skiena (The Algorithm Design Manual).

Q: Which sites should I practice on?
A: SPOJ has the largest collection of problems. CodeChef has a very nice classification of Easy, Medium, Hard problems. UVa has a large number of problems suitable for ICPC training. CodeForces holds extremely regular contests (and editorials are also regular). TopCoder has regular SRMs but has a very messy arena applet that you have to get used to (but tutorials on TopCoder are great). HackerRank has a nice set of problems divided by difficulty and topic. Finally, it is up to you. I currently use UVa and CodeForces the most, but would suggest CodeChef for beginners due to its classification.

Q: What to do if I get a TLE (Time Limit Exceeded)?
A: Usually your algorithm might not be good enough. Check if the number of elemental operations (like single element access or write or operation etc) totally stays bounded by 10^6 or 10^7. If so, it should run within a second. Otherwise, you're in trouble. Even an algorithm that takes 10^8 operations would be too slow. For example, on an array of size 10^5, a sort using bubble sort (requiring quadratic time) is too slow, but merge sort (requiring N log N time) runs within a second.

Q: What to do if I get a MLE (Memory Limit Exceeded)?
A: You really need to cut down on what memory you are using. Reuse memory space if possible. Usually this verdict from a grader is very very rare. I've got it only once or twice totally in my whole experience until now, and it was fixed by rethinking the usage of memory. A million sized array if integers takes up 4 MB of space, and million 'long long's take up 8 MB space, so this is usually not a problem at all.

Q: What to do if I get a WA (Wrong Answer)?
A: Check if you are missing any \n while outputting your answer. This is a common mistake that might happen even if code and algo is correct otherwise. If that isn't the problem, try running the algo on paper and see that it matches for all cases you can throw at it. If it still works, you haven't tried some corner cases (n=0,1, or very very large). Also, are you sure you are using the right data type? int stores max of ~10^9, long long stores ~10^18. Use it instead. Remember, intermediate calculations also should fit within the limit. If all else fails, try rethinking the algorithm.

Q: I cannot figure out the algorithm. What should I do?
A: If this is in a contest, there is usually not much you can do other than just think or move onto another question. However, if this is when practicing, try working out the example on paper first to make sure you've understood the question. If nothing yet, don't give up (or check online yet). Mentally take note of the question and go on with other work. The answer will strike at the most random time (in the shower for instance). If you still get nothing (after a few days/weeks/months depending on difficulty of question), search for an editorial for the question, or ask someone who's solved it. If it involves a new concept, learn it (don't just copy code without understanding). If it did not require anything new, then re learn whatever was needed. After that, try coding it in and getting an AC. Search for related concepts and questions and solve them.

Q: I know the algo, but cannot code it in. What should I do?
A: Isn't this obvious? Relearn the topics needed.

Q: Can you see my code for problem XXX from site YYY?
A: Nope. I do not have the time to do this individually. Also, competitive code is usually hard to read and difficult to find mistakes in. I usually cannot understand my own (competitive) code after a few days (developmental code is understandable even after years however, since it is written that way). You're on your own. However, if you write the algo in simple English (or pseudo code) then it might help for you to figure out what is wrong by yourself. Also online forums exist just for this. If nothing else, if you can find me a great question, I might just write an editorial on it (see example here).

Q: My question is not in this FAQ.
A: Write it in the comments below.

9 comments:

  1. HackerRank also have nice collection of problems sorted both topic wise (useful when you need to strengthen a particular topic like eg. DP) and difficulty wise.

    ReplyDelete
  2. Thanks a lot that helped a lot! :)

    ReplyDelete
  3. Can we start competitive programming without havin knowledge of algorithms yet?
    I am in the third semester and the algorithm course is in fourth sem. Would you recommend going for competitive programming after taking that course or can I start now even?

    ReplyDelete
  4. I see it's helpful post from various aspects.

    ReplyDelete
  5. I am master's student in data science field. So, I am well with python. I want to start competitive programming for purpose of getting into big product based companies. So, should I stick to python, or C++ will be much better option?

    ReplyDelete
    Replies
    1. I wrote the post approx 3 years ago, and I believe my thoughts have changed a bit. I believe that the best language to pick is the one that best suites the task, and for this, one needs to get proficient at a whole variety of different languages. Each "unlock" a different style of thinking, and lead to overall growth; so my long-term suggestion would be to learn both.

      For the short term however, pick the one that is permitted to be used in the sort of contests that you will be taking part in. If both are allowed, pick whichever you are more comfortable in, but keep in mind what the costs for this might be:

      Python tends to be a small constant factor slower than equivalent C++ when running. However, C++ tends to take more effort to write.

      Python's dynamic typing make debugging a pain in the rear. However, C++ is definitely more verbose and requires spelling _everything_ out to the compiler, even if it might be able to infer stuff.

      Python supports higher-order functions more nicely. C++'s support is much more forcefully tacked on.

      Overall, the best bang-for-buck is not the languages though. The best value in my opinion is gained by _thoroughly_ understanding algorithmic concepts. I still stand by my recommendations for the videos and books mentioned in my post; and while they will take effort, I would recommend trying to implement everything you learn in your language, and store every single answer, and as you go further along, look back and see if you could have done it better.

      Alongside this, practice, practice, practice. There's tons of content available online, and merely learning the concepts is not enough. You have to actually practice in the real world.

      Hope that helps :)

      Delete
  6. that's impressive . keep it up :)

    ReplyDelete