Sidhant Bansal
https://sidhantbansal.com/
Mon, 23 Sep 2024 10:26:53 +0000
To concentrate or not to concentrate
<p>To use concentration bounds or <strong>anti-concentration</strong> bounds, that is the question.</p>
<p>In one of my previous posts reviewing the CS5330 course, I explained how many randomized algorithms exploit concentration bounds, in particular the Chernoff bound. Up till my third year I was under the impression that we would never want to argue in the reverse direction. Doesn't make sense, right? Why would you ever want to bound the odds of a good event from above?</p>
<p>Well, it turns out that in some cases anti-concentration bounds are also useful. I will share two examples in this post: the first I encountered as a TA for the CS4231 - Parallel and Distributed Algorithms course, in the consensus setting; the second I came across in my final year research thesis.</p>
<h2 id="problem-1">Problem 1</h2>
<p>(Credits: Rephrased from one of the course examination questions set by Prof. Haifeng Yu for CS4231)</p>
<p>Let me first give the setup of the well-known classical consensus problem.</p>
<p>You are given $N$ nodes, labelled from $1$ to $N$. Each node originally holds a binary value (say $b_i$), i.e. either $0$ or $1$. All the nodes want to come to consensus on a common value, i.e. $0$ or $1$. The method of communication between these nodes is <strong>asynchronous</strong> pair-wise.</p>
<blockquote>
<p>In <em>asynchronous</em> channels there is no guarantee that a message from node $a$ to node $b$ will be at most $k$-times slower than a message from, say, node $c$ to node $d$, for any $k$. Messages between a given pair of nodes do arrive in order though. The easiest way to digest the asynchronous model is: you have no easy way to say that a message has <em>timed out</em> or got <em>dropped</em>, since there is no well-defined upper bound on delivery time. Running round-based algorithms also becomes tricky in this setting.</p>
</blockquote>
<p>Besides the communication setting, you want to satisfy three basic conditions of consensus which are as follows:</p>
<ul>
<li>Agreement: All the non-faulty nodes should agree on a common value</li>
<li>Validity: If all the non-faulty nodes have the same initial value (i.e. $b_1 = b_2 = \dots = b_n$), then the agreed upon value by all the non-faulty nodes should be that one.</li>
<li>Termination: All non-faulty nodes should terminate and eventually decide on a value</li>
</ul>
<p>So if no node is faulty, this sounds trivial, right? Everyone just sends their original value to everyone else, looks at all the values, and decides on a common value. They could use a simple vote-majority or the max of all values and be done. Notice they can't do something trivial like always deciding on value $0$, because the <strong>Validity</strong> condition breaks if $b_1 = b_2 = \dots = b_n = 1$.</p>
<p>So now comes the interesting bit in this setting: at most one node can fail, at an arbitrary point of time during the algorithm.</p>
<blockquote>
<p>And think of a devil (aka the <strong>strong-adversarial model</strong>) looking over the algorithm and the message communication, which decides to fail a node at the most inconvenient time for your algorithm. This devil can also enforce worst-case scheduling in the asynchronous model. Now how do these nodes come to consensus? Well, the <a href="https://timroughgarden.github.io/fob21/l/l4-5.pdf">FLP theorem</a> states that this is impossible to achieve. The proof is quite involved; it took us two lectures to cover in CS4231. Attached above is a good resource available online that goes over it.</p>
</blockquote>
<p>As a thought exercise, try to argue why an algorithm like the one defined below will fail when the initial values are $(0, 0, \dots, 0, 1)$ or $(1, 1, 1, \dots, 1, 0)$ or $(0, 0, \dots, 0, 1, 1, \dots, 1)$ (50% 0s, 50% 1s)</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 1. Send my own value to every node (including myself)
2. Wait to collect the first $N-1$ values
 3. Based on a bit-majority / maximum over these values, come to a decision
</code></pre></div></div>
<p>The question that I encountered relaxes the constraints; namely, we are provided with two additional things:</p>
<ul>
<li>Each node has a private fair coin that it can flip to generate private random bits. The devil has no control over this randomness; however, it can look at the generated random bits ahead of time.</li>
<li>We don't need to come to consensus with 100% probability. We just have to maximize our success probability as a function of $N$, i.e. as $N$ increases, our probability of success should tend to $1$. Note: the probability is taken over the space of all random bits generated by all nodes. The original binary values each node holds, the devil's scheduling of messages, and the failure of a node all still follow the worst-case scenario, so no leeway there.</li>
</ul>
<p>Well, how will random bits help us? Try to use the hint that came attached with the question:</p>
<p>Let $X = X_1 + X_2 + \dots + X_k$ where each $X_i$ is a fair coin toss, i.e. 0 if tails and 1 if heads. Then $\Pr(X = i) \leq O(\frac{1}{\sqrt{k}}), \forall\ 0 \leq i \leq k$</p>
<p>Hmm, this sounds like an anti-concentration inequality. It is an upper bound on how much probability mass can be present at any single point of a binomial distribution. The proof of the statement above: $\Pr(X = i) = {k \choose i} \times \frac{1}{2^{i}} \times \frac{1}{2^{k - i}}$, which is maximized at $i = \frac{k}{2}$, where we get $\Pr(X = k/2) = \frac{k!}{(k/2)!(k/2)!} \cdot 2^{-k}$. Using <a href="https://en.wikipedia.org/wiki/Stirling%27s_approximation">Stirling’s approximation</a>, this expression simplifies to $O(\frac{1}{\sqrt{k}})$.</p>
<h3 id="try-to-think-of-how-we-can-apply-this-bound-and-use-the-private-coins-to-help-us-out">Try to think of how we can apply this bound and use the private coins to help us out?</h3>
<p>Well, the idea is pretty straightforward: every node generates a $0$ or $1$ using its fair coin (call it $r_i$) and broadcasts it (in our message model, the broadcast technically happens one-by-one) to every node (including itself). Once a node (say node $i$) receives $N - 1$ such values, it takes their sum and denotes it by $\alpha_{i}$.</p>
<p>Notice that $|\alpha_{i} - \alpha_{j}| \leq 1, \forall 1 \leq i, j \leq N$. The reason: say $\alpha_i$ excluded $r_x$ and $\alpha_j$ excluded $r_y$; then the net difference is $|\alpha_{i} - \alpha_{j}| = |(\sum_{a = 1}^{N}r_{a} - r_x) - (\sum_{a = 1}^{N}r_{a} - r_y)| = |r_y - r_x| \leq 1$</p>
<p>TL;DR of this entire $\alpha_{i}$ idea: we have come up with an <em>almost</em>-agreed random number between $0$ and $N - 1$. I say <em>almost</em> because the $\alpha_{i}$’s can actually take two values; if one is $Z$, then the other is $Z + 1$, because of the constraint $|\alpha_{i} - \alpha_{j}| \leq 1, \forall 1 \leq i, j \leq N$.</p>
<p>Also notice that $Z$ follows a binomial distribution, since it is the sum of $N - 1$ independent Bernoullis, so $\Pr(Z = c) = O(\frac{1}{\sqrt{N - 1}}) = O(\frac{1}{\sqrt{N}}), \forall 0 \leq c \leq N - 1$ holds.</p>
<p>For the time being, ignore the fact that some nodes came up with $Z$ and some with $Z + 1$ and just think that every node knows the same value $Z$. So now having come up with a consensus on a random number between $0$ and $N - 1$, how does that help us?</p>
<h3 id="idea-1">Idea 1</h3>
<p>Make node $Z$ the leader? Then whatever input value node $Z$ has, everyone follows it. My natural instinct was to think of this; it doesn't work. Why?
Well, because the devil controls the failure of nodes and can look at the random bits. So the devil can correctly calculate $Z$ and ensure that exactly that node fails, and therefore node $Z$ will fail before it can tell everyone what its input value is.</p>
<h3 id="idea-2">Idea 2</h3>
<p>Okay, so electing a leader using random bits doesn't work. Let's fall back to the simplest approach: everyone broadcasts their input value ($b_i$), collects the first $N - 1$ values they see, and does a vote-majority on them. What is vote-majority? We simply say: if I receive >= 50% 0s then I decide on $0$, otherwise $1$. Why does this fail? Because the devil knows the cutoff is 50% of the votes, so for the input configuration $(0, 0, \dots, 0, 1, 1, \dots, 1)$ (i.e. 50% 0s, 50% 1s) it can always ensure that at least two nodes end up with contradictory views: one node might see exactly 50% 0s, but another might see 49% 0s (because the devil failed one of the nodes with $b_i = 0$ mid-way). In that case, the former decides on $0$ and the latter on $1$, thus breaking the <strong>Agreement</strong> condition. Now, how does having a random variable $Z$ help tweak this algorithm?</p>
<p>What if we tweak the algorithm to: if a node sees at least $Z$ 0s then it decides on $0$, otherwise $1$. How does this do?</p>
<p>Well, it breaks <strong>Validity</strong> when all nodes have $b_i = 1$ and $Z = 0$, but this is an extremely low probability event, since $\Pr(Z = 0) = 2^{-(N-1)}$. In the other extreme case, when all nodes have $b_i = 0$, we are fine, since $Z \leq N - 1$.</p>
<p>What about <strong>Agreement</strong>? It breaks that too, when one node sees $Z - 1$ 0s but another sees $Z$ 0s. But this only happens when there are exactly $Z$ 0s and $N - Z$ 1s among the input values. Given a particular configuration, say $10\%$ 0s and $90\%$ 1s, what are the odds that $Z = 0.1N$? Well, it's at most $O(\frac{1}{\sqrt{N}})$ because of the upper bound above. So we conclude that, for a given configuration, the probability of the bad case $Z = \#\{i : b_i = 0\}$ is rather low.</p>
<p>But recall that some nodes actually came up with the value $Z$ and some with $Z + 1$; loosely speaking, this doubles the odds of failing (you can work out the exact math with pen and paper). So at an overview:</p>
<p>Pr(Failing) = Pr(No Agreement or No Validity or No Termination) $\leq$ Pr(No Agreement) + Pr(No Validity) + 0 (since this algorithm always terminates)</p>
<p>$ \leq 2 \cdot O(\frac{1}{\sqrt{N}}) + 2 \cdot 2^{-N} \leq O(\frac{1}{\sqrt{N}})$, for a fixed input configuration, over all the random bits generated.</p>
<p>QED.</p>
<p>Note: The exploit of randomization here loosely follows the same idea that <a href="https://codeforces.com/blog/entry/70203">this riddle: Two papers</a> uses.</p>
<h2 id="problem-2">Problem 2</h2>
<p>In my final year research thesis, we proved a lower bound for the 1-bit compressed sensing problem in a very specific setting (publication <a href="https://arxiv.org/abs/2202.10611">here</a>; refer to the Technical Overview section for the high-level idea).</p>
<p>I will explain the <strong>(n, k)-balanced</strong> problem here, which we reduced to our 1-bit CS problem.</p>
<p>A collection of vectors $V = \{v_1, v_2, \dots, v_m\} \subseteq \{\pm 1\}^{n}$ is (n, k)-balanced if for every subset $S \subseteq \{1, 2, \dots, n\}$ of size $k$ there exists $v_i \in V$ such that $|\sum_{j \in S}v_{i, j}| \leq 1$, i.e. for every $k$-sized subset of indices, there exists at least one vector in our collection whose sum over those index positions is <em>almost</em> $0$.</p>
<p>In our research, we argued that if each vector in $V$ is constructed by a naive coin-flipping process, i.e. $\forall v_i \in V, \Pr(v_{i, j} = 1) = \Pr(v_{i, j} = -1) = \frac{1}{2}, \forall 1 \leq j \leq n$, then $V$ will fail to satisfy the (n, k)-balanced condition with <strong>at least</strong> constant probability when the size of the collection, i.e. $m = |V|$, is too small.</p>
<p>Note: here $v_{i, j}$ denotes a Rademacher random variable: +1 with 50% chance and -1 with 50% chance. It is similar to a typical Bernoulli, just supported on $\{-1, 1\}$ instead of the typical $\{0, 1\}$.</p>
<p>Let $F_s$ be the event that for every $v_i \in V$, $|\sum_{j \in s}v_{i, j}| > 1$. Then, if $m = |V|$ is too small, we want to show that $\Pr(\bigcup_{s \subseteq \{1, \dots, n\}, |s| = k}F_s)$ is lower bounded by some constant.</p>
<blockquote>
<p>Fact: We used De Caen’s inequality for this (unrelated to the core idea of this blog, but relevant anyways), which states: $\Pr(\bigcup_{i \in I}A_i) \geq \sum_{i \in I}\frac{\Pr(A_i)^2}{\sum_{j \in I}\Pr(A_i \cap A_j)}$, where $\{A_i\}_{i \in I}$ is a finite family of events.</p>
</blockquote>
<p>To plug the $F_s$’s in for the $A_i$’s in this inequality, we need to calculate $\Pr(F_s), \forall s$ and $\Pr(F_s \cap F_t), \forall s, t$, where $|s| = |t| = k$. Notice that all the $\Pr(F_s)$ are identical, by symmetry. Ignore how the intersection event is calculated precisely (for the purposes of our original problem, this was rather involved and cumbersome).</p>
<p>To understand the anti-concentration concept better, let's just focus on calculating $\Pr(F_s)$.</p>
<p>Let $E_{s, i}$ denote a single failure event, i.e. $|\sum_{j \in s}v_{i, j}| > 1$, such that $F_s = \bigcap_{i = 1}^{m}E_{s, i}$</p>
<p>To lower bound the probability of the big bad event $F_s$, we need to lower bound the probability of the small bad events $E_{s, i}$. Notice that $E_{s, i}$ basically says that a <em>binomial-looking</em> sum (made up of $k$ independent Rademachers) must land away from its center.</p>
<p>We already know that the probability of a binomial sitting at its center (or any other single point) is $\leq O(\frac{1}{\sqrt{k}})$; turning this around, the sum will <em>NOT</em> be at the center with probability $\geq 1 - O(\frac{1}{\sqrt{k}})$. This gives us $\Pr(E_{s, i})$.</p>
<p>Since all the $v_i$’s are independent, $\Pr(F_s) \geq (1 - O(\frac{1}{\sqrt{k}}))^{m}$.</p>
<p>Notice here: if the anti-concentration bound for binomials had been $O(\frac{1}{k})$ (a hypothetical) instead of $O(\frac{1}{\sqrt{k}})$, then our lower bound on $\Pr(F_s)$ would have improved (i.e. gotten higher), giving us a tighter lower bound on $m$ (i.e. a proof that an even larger $m$ is not sufficient for the (n, k)-balanced property to hold).</p>
<h2 id="takeaways">Takeaways</h2>
<p>Anti-concentration inequalities can be useful when you know that, in an imaginary world where you could get a uniform distribution, that uniform distribution would yield a better result than the distribution you are currently working with.</p>
<p>Think about it: in Problem 1, if we could construct $Z$ such that $\Pr(Z = i) = \frac{1}{N}, \forall 1 \leq i \leq N$, that would be ideal. It would make all input $b_i$ configurations equally bad for us, each failing with at most $O(\frac{1}{N})$ probability. But in the real world we know that $\Pr(Z = N/2) = O(\frac{1}{\sqrt{N}})$, so the input configuration $(0, 0, \dots, 0, 1, 1, \dots, 1)$ hurts us the most and gives a failure probability of $O(\frac{1}{\sqrt{N}})$.</p>
<p>Similarly in Problem 2, we argued above how a uniform distribution would have helped us further tighten the lower bound on $m = |V|$.</p>
<p><a href="/"><img src="/images/binomial_ex1.png" alt="Example of binomial distributions" style="width: 800px;" /></a></p>
Sat, 04 Feb 2023 00:00:00 +0000
https://sidhantbansal.com/2023/To-concentrate-or-not-to-concentrate/
https://sidhantbansal.com/2023/To-concentrate-or-not-to-concentrate/
My CP Journey - Part 2
<p><a href="https://www.sidhantbansal.com/2019/My-Cp-Journey-Part-1/">Here</a> is the first part. So I joined NUS (National University of Singapore). The competitive programming environment in NUS is competitive, to say the least.</p>
<p>The main driving force for CP in NUS is Prof. Steven Halim, who is the head coordinator for the ICPC within NUS + Author of CP3 book + Involved in IOI as Singapore head coach. His efforts + the good reputation of NUS attracts a few medallists from different olympiads every year. This ensures a regular supply of smart and competitive students interested in ICPC.</p>
<p>Now let me first explain how ICPC regionals work in South East Asia. In South East Asia, you are allowed to attend at most two regionals in a single year, and you are NOT restricted by national boundaries; that is, you are allowed to compete in the regionals of other countries. I think the reason behind this is that countries in South East Asia are not big, especially Singapore (which has only 4-5 technical universities), so you cannot conduct a regional with teams from only a single country. Also, most regional sites have a quota for international teams. For example, say the Jakarta, Indonesia regional has a 30% quota for international teams, of which NUS got allocated 4 slots (by the regional's administration team); then this means that the NUS coach can send any 4 teams he wants (based on NUS's internal selection rules) to represent NUS at this regional. Unlike India, the international teams more often than not don't have to sit any sort of official prelims conducted by the regional site. Instead, NUS has its own internal selection contest of some sort.</p>
<p>In my batch we had 5-6 new IOI kids (Me, Bernard, Ranald, Sergio, Robin, Kwee Lung, Minh).</p>
<p><strong>1st Year</strong>: The ICPC selection process was as follows: An individual contest happened with students from all batches that do CP and let the ranklist obtained via it be represented as a set S.
Then,</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// S: ranklist from the individual contest (student IDs, best rank first)
int team_id = 1;
while (!S.empty()) {
    int leader = *S.begin();               // highest-ranked remaining student
    S.erase(S.begin());
    // the leader picks any two of the remaining students as teammates
    pair<int, int> x = teammate_picks_of(leader);
    S.erase(S.find(x.first));
    S.erase(S.find(x.second));
    cout << "Team " << team_id << ": " << leader << " " << x.first << " " << x.second << endl;
    team_id++;
}
</code></pre></div></div>
<p>In my first year, given the competitive pool of students in NUS, I got a rank of around 15 in the above-mentioned contest and was selected by Bernard into his team; our team was Me, Bernard and Si Wei. Our team name was 3body (motivated by the astronomical problem of 3-body systems).</p>
<p>We didn’t really do a lot of practice contests before competing in our one and only regional that year (we competed in only one regional because we were the 5th team in the ranking algorithm) in Jakarta, Indonesia. Surprisingly, we did fairly well at Jakarta (relative to our expectations) and got 5th position in that regional.</p>
<p>Later that semester I also took the course CS3233 (a course on Competitive Programming, taught by Prof. Steven) along with all the other IOI kids from my batch, and was introduced to Bay Wei Heng. The course itself is a very fast-paced version of how one should go about CP, and most (around 80%) of the students in the course have prior background, either through school olympiads or by acquiring a lot of points on Kattis (an online judge, which is very trendy in our university). The course is fun, though quite competitive, with weekly high-pressure contests (1:15 hours long) consisting of good problems and cash prizes. The course also attracts a few companies which are scouting students for internships, which makes getting internships a slightly easier process.</p>
<p><strong>2nd Year</strong>: For the next year, we changed our team slightly and the new combination was Me + Bernard + Wei Heng. Our team name was 3body2 (= 3body + 1). We did a fair number of practice contests (mainly on CF Gym) during this period and figured out our strengths and weaknesses, plus how to collaborate as a team. The TL;DR of our team combination: Wei Heng did most of the math and flow related questions, Bernard was particularly good at constructive algorithms/geometry/greedy, and I was responsible primarily for data structure questions (segment trees, HLD, LCA, hashing) and DPs. We overlapped a lot in DP, greedy and graphs, like most other teams would, since these are the topics in which almost everyone is good. My background in data structures is, to be honest, at most above average/reasonably good by Indian competitive programming standards, because data structures are used so heavily in India that everyone there is quite proficient in DS. However, coming to university I realised that the South East Asian programming culture is more balanced, with a good mix of mathy + constructive + geometry + flowy questions too. Because of this balance, the same skill set I had in India now got portrayed, and was exploitable, as being a Data Structures guy, since the level of data structures here was relatively lower.</p>
<p>The rule for internal tie-breaking within our university this year to select a team for WF (ICPC World Finals) was to minimize the following:</p>
<p>Rank at Singapore regional * constant_1 + Rank at 2nd regional * constant_2 + Rank in internal contest * constant_3</p>
<p>I don’t remember the constants as of now, but the important thing is constant_1 > constant_2 > constant_3.</p>
<p>This year we ranked 4th in the internal contest and competed at two sites: Yangon, Myanmar and Singapore. Yangon, Myanmar as an ICPC site had a reputation for erratic results, because of weak test data plus problem statements sometimes not written clearly enough. Because of this unpredictability, a team like ours also had a good shot at a good rank, and fortunately for us we did end up getting 1st rank. (You can read about our experience <a href="https://docs.google.com/document/d/1fAsCm060ADOVNQvImvzTcKAL_pc_8q6Cbs4FyJA65PY/edit?usp=sharing">here</a>.) Our second and final site was Singapore, where our team + SendBobsToAlice + Pandamio all had a chance of securing a slot for the World Finals. In the Singapore regional, we did okayish and got rank 11. Our rank wasn't that good, but it made us second in the tally (among us + SendBobsToAlice + Pandamio). Also, SendBobsToAlice (first in the tally) decided not to go this year (since some of their teammates had already gone to WF before, and they wanted to save their second and final chance at WF for a future iteration, when their team would be more prepared and could potentially secure a very good rank). So because of the pass by SendBobsToAlice, our team got the slot and we represented NUS at the ICPC World Finals 2019 in Portugal.</p>
<p>Before the World Finals I got to attend the Moscow Workshop conducted by MIPT, a 1-week long workshop in the heart of Russia, in Moscow. Each day we had a contest with quite tough problems (out of 8-10 problems, my teammates and I were able to solve 3 on average), followed by an analysis discussion of the problems. We also got to see some touristy spots of Russia, and the snowy weather was a different experience too. The contests consisted of high-quality problems involving novel techniques, with very strong teams participating (we generally ended up in the middle of the rank table). The accommodation and recreational activities were also good. The camp itself was funded by NUS-ICPC (through the generous donations that Prof Steven Halim acquires from various companies), so we didn't have to fund our flights/camp fee. But in case anyone does plan to attend the camp using their personal funds, I would recommend remote participation for a better worth-your-money experience.</p>
<p>The World Finals itself is an enjoyable event and was a kind of reunion for me, where I got to meet my friends from Indian universities after a long while. I met Rajat De and Pratyush Aggarwal, two students from my high school, along with Kushagra, Sreejata, Debjit, Sampriti, Xorfire, Parth and Hanit, all of us having been to the Indian IOI Training Camp at some point or the other. The contest itself was good and we did okayish: we solved 4 problems, but on two of them we wasted a lot of time debugging our solutions, costing us a potential solve on another problem. The scope for improvement was to have made fewer bugs and/or debugged them faster, so we could have solved more problems. In the end we performed below our expectations, getting a joint rank of 60ish, whereas we believed our best-case hypothetical scenario was under 30.</p>
<p><strong>3rd Year</strong>: We stuck with the same team combination and named ourselves 3body3 (3body2 + 1). We did two regionals again this year: one in Bangkok, Thailand, where we ranked 4th and missed out on a potential second position because of a last-minute bug on one of the problems, and the second in Kuala Lumpur, Malaysia (our experience <a href="https://algorithmics.comp.nus.edu.sg/2019-KualaLumpur-Report-team-3body3.pdf">here</a>), probably our best performance to date, where we finished the entire problem set at around the 4:30 mark and secured first position comfortably. The internal tie-breaking rule this year was the sum of ranks at the two regionals. Our sum was 4 + 1 = 5. After some 3-4 regionals in different regions of South East Asia had happened, the list of potential WFists from our university was reduced to us or SendBobsToAlice (who came 1st in Jakarta, Indonesia). Then, at the remaining Vietnam site, SendBobsToAlice ranked > 4, making their sum > 5 (i.e. ours), and we got lucky again. Do let me clarify that SendBobsToAlice is a stronger team than ours on an average day, i.e. in > 70% of the contests where both we and SendBobsToAlice sat together, they have beaten us. However, luck happened to be on our side this time around and we got through. This just shows that the ICPC selection mechanism is high variance and involves some luck for things to go your way. We did secure our slot for the WF this year, which was supposed to be conducted in Moscow in mid-June 2020. But because of the entire COVID-19 situation it got postponed and rescheduled for 2021. Let's see what happens then :)</p>
<p><strong>Camps attended</strong>: I mainly attended the Moscow camp during my university education: once in Moscow before our first WF, a second time when they came to Singapore before the next year's regionals season, and a last time, remotely, in April 2020 (before the “original” WF 2020 dates). I would say the problem set was high quality every time. Relatively speaking, the Singapore one was more doable and the other two were much harder. You get to see plenty of new techniques. The only thing I feel they don't put an effort into improving is the analysis discussions, where they could be more lucid in explaining and actually discuss implementation details for questions with hard implementations.</p>
<p><strong>Problem Setting</strong>: During my high school period, I wasn't much into problem setting and had only set a few CodeChef long challenge problems. But coming to college I was given many more opportunities: the Indian IOI Training Camp, Indian ICPC Regionals, the NUS course curriculum and Singapore IOI selection tests. I find problem setting pretty fun. My way of setting problems is to maintain a .txt file on my PC with all the new ideas I see or interesting observations I make in problems/theory I read (be it CS coursework, some theorem I read on Wikipedia, or even a math coursework problem), and later, when some contest (say ICPC Regionals) is nearing, I try to form a problem using those ideas. I am glad to say that I improved reasonably as a problem setter and tried my best to provide novel problems to the community. The collection of problems I have set can be seen <a href="http://www.sidhantbansal.com/problems-list/">here</a>. I must also add that problem setting is rewarded well in terms of monetary compensation, at least for Indian ICPC Regionals and the NUS course curriculum, which is a bonus as it gives you a good feeling of being productive. It also helps you think about how to design test data for certain problems effectively.</p>
<p><strong>Losing Interest in CP</strong>: My interest in CP has declined over the years, and now I am not that fascinated by solving questions / getting ACs. I still like to solve problems every once in a while and still get excited when I see a new CF blog with some new random technique, but yeah, my interest in the domain has definitely decreased. I find things somewhat repetitive at some level, and finding novelty becomes harder when you have seen so many questions/similar stuff before. I think that is part and parcel of any hobby/interest one has. I can only hope that I still find CP interesting in the years to come. I would say my algorithmic curiosity has not decreased but branched off to other domains that I got to know about in university, mainly randomised algorithms, sampling/streaming algorithms, etc. These kinds of algorithmic courses change our fundamental assumptions about how accurate we want our answers to be, how much memory we have to store the data, and how much computation power we have (it might be a distributed setting, for example). These variants allow you to think about algorithms in different ways; at heart it is still logical reasoning to solve a problem, but the tools and the constraints are specified in a different format. I have tried introducing some of these ideas into the CP problems I have set (e.g. I set a question for which I came up with a randomised solution first and then a deterministic one (Relay Marathon), and a question about a simple distributed algorithm, Boruvka's Algorithm (Spanning Tree)). I hope these kinds of domains are tested more in CP in the years to come and become more mainstream.</p>
<p><strong>Personal Practice</strong>: I didn't really do a lot of individual practice during my university life. We mainly did CF Gyms as a team a few times every year and competed in the Moscow workshops. Apart from these, I didn't explicitly practice a lot on CF or any other judge for that matter. I was able to reach Yellow on CF (and hope to reach Red someday), but didn't sit in a lot of contests, primarily because of my loss of interest in CP + constant academic commitments in university + the Singapore time zone not suiting CF well. I also took away that there is a law of diminishing returns in how much you gain by practicing as you become better in a domain. Even if I practice a lot, or reasonably enough, it is hard to see the effect now, compared to when I was in school and the effect was much more visible.</p>
<p><strong>Giving back</strong>: I have been fortunate enough to learn so much about CS and algorithms through CP and the friends I made from CP. As a way of imparting my knowledge to other students, I have contributed to <a href="https://www.codechef.com/LEARNDSA">CodeChef's learning series</a> (kind of like a weekly structured course, aimed at getting novice students interested in CP up to pace with what kind of theory is expected to be known, etc.). I have also helped a few students in the Indian school coding scene with resolving their doubts and queries about CP.</p>
<p>In the end I would like to say that I have been more fortunate than smart in my CP journey. I am not the smartest CP guy in my college; my two teammates were hands down better problem solvers than me, and even other teams had better people than us in terms of individual CP skills. But we were fortunate to have been to regionals which were easier to crack + had a good team combination where we complemented each other's strengths and weaknesses well, i.e. we weren't significantly bad in any category of problems, and all of us are decentish at implementation, so we didn't have the high risk of a single implementation guy who might have a bad day.</p>
<p>Also, I would like to thank Prof Steven, the main driving force behind NUS ICPC. He went out of his way to conduct internal contests, discuss team strategies, help me gain TA experience, etc. He also ensured that NUS ICPC had enough sponsors so that we could attend regionals, WFs and camps without spending a single cent of our own.</p>
<p><a href="/"><img src="/images/scoreboard1.png" alt="Internal Scoreboard in 2017" style="width: 400px;" /></a>
<a href="/"><img src="/images/scoreboard2.png" alt="Internal Scoreboard in 2020" style="width: 400px;" /></a></p>
<p>One of the achievements unlocked by the end of my Year 3 was moving to first position on the internal scoreboard that Steven maintains. Not that it reflects that I was the best, since it takes into account CS3233 grade + Kattis points + CF rating, etc., which a lot of good people don't put enough effort into.</p>
<p>Well, this post was long overdue. Now I am only left with WF2020 (which I hope happens as expected in 2021) and maybe I will add an update to this post once it happens. But that’s it folks.</p>
<p><a href="/"><img src="/images/meme.jpg" alt="It ain't much, but it's honest work" style="width: 400px;" /></a></p>
<p><a href="https://photos.app.goo.gl/YB6choe9Ky1if63h9">Photographs of all the main events I could collect</a></p>
Thu, 09 Apr 2020 00:00:00 +0000
https://sidhantbansal.com/2020/My-Cp-Journey-Part-2/
https://sidhantbansal.com/2020/My-Cp-Journey-Part-2/Geometric intuition of Bayes' Theorem<h2 id="motivation">Motivation</h2>
<p>Q1 → <em>1% of women at age forty who participate in routine screening have breast cancer. 80% of women with breast cancer will get positive mammographies. 9.6% of women without breast cancer will also get positive mammographies. A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer?</em> <br />
<strong>ALERT</strong> - Solving this question using a calculator and Bayes' Theorem is easy. What I challenge you to do is, without picking up your calculator or writing a single digit, estimate the answer up to an accuracy of, let us say, $\pm 5\%$. And yes, you are allowed to use pen and paper, but only for diagrams.</p>
<p><strong>tl;dr</strong> Try to estimate the answer, without explicitly applying the Bayes’ Theorem.</p>
<blockquote>
<p>But why? That would be your next logical question.
Well, take a simpler question.</p>
</blockquote>
<blockquote>
<p>Q2 → Given a fair 10-faced die, what is the probability that it lands on a number which is even but not divisible by 3?
You won't even take 10 seconds to do this question. You are proficient enough with a question like this that you can imagine all 10 possible die throws and consider only those in which an even number not divisible by 3 comes up. In other words, you can intuitively reason about the question without picking up pen and paper.</p>
</blockquote>
<p>But can you do the same for the original question asked?
Well, if you can, and got the answer correct by merely eyeballing, then you may altogether ignore this blog post. But if not, then join me on this fruitless exercise.</p>
<h2 id="abstract">Abstract</h2>
<p>In the coming few paragraphs, we will gradually build up different visualisations for different kinds of scenarios. I assume you may already know some of these, but the idea here is not to merely show the method, but to intuitively question the “why”.</p>
<h2 id="independent-events">Independent Events</h2>
<p>Before handling questions where events are inter-dependent on each other making things complex (as seen in Q1), first let us handle the simple cases.</p>
<h3 id="case-1-a-single-event-using-q2-as-a">Case 1: A single event (Using Q2 as $A$)</h3>
<p>Let us say we have an event $A$, and let the probability of the event be $P(A)$.
Now how do we visualise this probability?</p>
<ul>
<li>Let all the possibilities be named $a_{1}, a_{2}, … $. Formally, $P(A) = \frac{n(A)}{n(Total)}$, where $n(A)$ stands for the number of possibilities where event $A$ happens and $n(Total)$ stands for the total number of possibilities.</li>
<li>So we could represent all these possibilities $a_{1}, a_{2}… $ as points on an axis, and let the green points be those which belong to $A$ and red points be those which don’t.
<a href="/"><img src="/images/bayes-1.png" alt="single event" style="width: 400px;" /></a></li>
</ul>
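<p>To make this concrete, here is a quick enumeration of Q2's sample space (a minimal Python sketch; the variable names are my own):</p>

```python
from fractions import Fraction

# Sample space of a fair 10-faced die: the outcomes 1..10.
outcomes = range(1, 11)

# Event A from Q2: even but not divisible by 3 -> the "green points".
event_a = [x for x in outcomes if x % 2 == 0 and x % 3 != 0]

# P(A) = n(A) / n(Total)
p_a = Fraction(len(event_a), len(outcomes))
print(event_a)  # [2, 4, 8, 10]
print(p_a)      # 2/5
```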
<h3 id="case-2-2-independent-events">Case 2: 2 independent events</h3>
<p>Let us say we have another event $B$; similarly define $P(B)$ as the probability of $B$ happening and $b_{1}, b_{2}, … $ as all the possibilities, out of which a subset belong to $B$ and the others don't, such that $P(B) = \frac{n(B)}{n(Total)}$.
Let $B$ be the set of sequences of 2 coin tosses in which #Heads = 1, i.e. {HT, TH}, out of the total sample space {HH, HT, TH, TT}. We can clearly see that $A$ and $B$ are independent of each other, i.e. the outcome of one event DOES NOT AFFECT the other.</p>
<ul>
<li>Now how do we represent this visually?</li>
<li>Can we put both events on the same line? Not in this case, because the underlying sample spaces are different. In some cases we might have had that opportunity, if $A$ and $B$ lived on the same underlying space. But even then, putting them on the same line never turns out to be useful.</li>
<li>So what we do instead is use a 2d space.</li>
</ul>
<p><a href="/"><img src="/images/bayes-2.png" alt="two independent events" style="width: 400px;" /></a></p>
<ul>
<li>Let us try to determine $P(A \cup B)$: it should be n(pairs of $a_{i}, b_{j}$ where $a_{i} \in A \text{ or } b_{j} \in B$) / n(all possible pairs), i.e. the purple region.</li>
<li>And what about $P(A \cap B)$? It should be n(pairs of $a_{i}, b_{j}$ where $a_{i} \in A \text{ and } b_{j} \in B$) / n(all possible pairs), i.e. the green region.</li>
</ul>
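<p>The 2d picture can be checked by brute-force enumeration over all pairs (a sketch using Q2's die event as $A$ and the coin-toss event as $B$; helper names are my own):</p>

```python
from fractions import Fraction
from itertools import product

die = range(1, 11)
in_a = lambda x: x % 2 == 0 and x % 3 != 0   # P(A) = 4/10
coins = ['HH', 'HT', 'TH', 'TT']
in_b = lambda s: s.count('H') == 1           # P(B) = 2/4

# Every point in the 2d space is a pair (a_i, b_j).
pairs = list(product(die, coins))

p_union = Fraction(sum(1 for a, b in pairs if in_a(a) or in_b(b)), len(pairs))
p_inter = Fraction(sum(1 for a, b in pairs if in_a(a) and in_b(b)), len(pairs))
print(p_union)  # 7/10 = 2/5 + 1/2 - 1/5
print(p_inter)  # 1/5  = 2/5 * 1/2, since A and B are independent
```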
<h2 id="dependent-events">Dependent events</h2>
<p>Dependent events are trickier to work with. You can't place them on different dimensions like independent events, because one event happening can be related to the other event happening. Now each possibility in the sample space is defined by two things, {did A happen, did B happen}; this {bool, bool} pair represents each sample unit.</p>
<p>Consider $Q1$ where A = having positive result in the test (Green) and B = having cancer (Red)</p>
<p>Now, when representing all these possibilities on a line, how would you arrange them?
You might consider ordering the 4 pairs, something like {(True, True), (True, False), (False, True), (False, False)}. But this destroys some relationship information.</p>
<p>Example: n(True, True) / (n(True, True) + n(False, True)) = $\frac{P(A \cap B)}{P(B)} = P(A | B)$, can no longer be easily seen.</p>
<p>So we again need two dimensions; however, now they are just being used so that the subset/superset relationships between these 4 possibilities can be depicted easily.
The key here is to see WHY two-dimensional Venn diagrams are enough and WHY one dimension does not cut it.</p>
<p>Well, consider what all relationships we want:</p>
<ol>
<li>(True, True) connected with (True, False)</li>
<li>(True, True) connected with (False, True)</li>
<li>(True, False) connected with (False, False)</li>
<li>(False, True) connected with (False, False)</li>
</ol>
<p>Basically (a, b) is connected with (a, not b) and with (not a, b).
Now there are 4 nodes and 4 edges, so we can't represent this WITHOUT a cycle.
It actually looks like a diamond, and that is why you need the second dimension. In one dimension you can represent a chain but not a diamond.</p>
<p>The Venn diagram actually emulates this diamond. All the edges in the diamond can be seen as edges shared between different regions of the Venn diagram. Sharing an edge is a much stronger relationship, and actually establishes a superset/subset area relationship, as compared to merely sharing a vertex/point in the Venn diagram.</p>
<p><a href="/"><img src="/images/bayes-4.png" alt="venn diagram = diamond" style="width: 400px;" /></a></p>
<h3 id="case-3-2-dependent-events">Case 3: 2 dependent events</h3>
<p>So now just represent $A$ as one circle and $B$ as another circle. What does the LHS of Bayes' theorem become?</p>
<p>$P(B | A) = $ Intersection area / Area of $A$ = Purple Area / Green Area,</p>
<p>where the Intersection area is a percentage of $B$'s area, with $\%age = P(A | B)$.</p>
<p>Therefore, Intersection area $= P(A | B) * P(B)$.</p>
<p>And Area of $A = P(A) = $ Intersection area + Remaining area $= P(A | B) * P(B) + $ (%age area of $A$ not in $B$) $= P(A | B) * P(B) + P(A | B^c) * P(B^c)$.</p>
<p>Now, Intersection area / Area of $A$ is exactly the RHS of Bayes' theorem.</p>
<p>So you JUST need to estimate the ratio of the intersection area with respect to $A$.</p>
<p>Here we go:</p>
<p><a href="/"><img src="/images/bayes-3.png" alt="two dependent events" style="width: 400px;" /></a></p>
<p>The crude approximation here is: Circle B is 1% (given) and the intersection is 80% of B (given), so the intersection area is $0.8\% \approx 1\%$.</p>
<p>Now the area of Circle A is $\approx 1\% + (9.6\% \text{ of } 99\%) \approx 1\% + (10\% \text{ of } 100\%) \approx 11\%$.</p>
<p>So, approx answer is $\frac{1\%}{11\%} \approx 9\%$</p>
<p>Is this correct? Nope, but it is close enough. $_{\text{ actual answer is around } 7.76\%}$</p>
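<p>For reference, the exact computation behind that 7.76% (a short Python check; variable names are my own):</p>

```python
# Q1's numbers: P(B) = P(cancer), P(A|B), P(A|B^c) from the problem statement.
p_b = 0.01                 # 1% prior of cancer
p_a_given_b = 0.80         # positive mammography given cancer
p_a_given_not_b = 0.096    # positive mammography given no cancer

intersection = p_a_given_b * p_b                      # intersection area
area_a = intersection + p_a_given_not_b * (1 - p_b)   # total area of circle A

p_b_given_a = intersection / area_a                   # Bayes' theorem
print(round(100 * p_b_given_a, 2))  # 7.76
```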
<p>Key takeaway: Bayes' is just a smart way of looking at ratios with unknown information.</p>
<p>Things to ponder about: Can we use rectangles instead of circles when drawing Venn diagrams? Also, is there any label we can put on the x-axis and y-axis for dependent events in Venn diagrams?</p>
Tue, 06 Aug 2019 00:00:00 +0000
https://sidhantbansal.com/2019/Geometric-Intuition-Bayes-Theorem/
https://sidhantbansal.com/2019/Geometric-Intuition-Bayes-Theorem/CS5330 - Randomized Algorithm Review/Summary<p>So I took <a href="https://www.comp.nus.edu.sg/~gilbert/CS5330/">CS5330</a>, a randomized algorithms course, this semester at NUS, taught by Prof. Seth Gilbert.
The course content in itself is nothing short of beautiful.</p>
<p><a href="/"><img src="https://imgs.xkcd.com/comics/psychic.png" alt="xkcd comic" style="width: 400px;" /></a></p>
<p>I will be describing the broad things I took away from the course.
All the material + my notes for the course can be found <a href="https://www.dropbox.com/sh/lbvlrll6l7squ6l/AAArkDreW_DpKn4Lmj4PxUYua?dl=0">here</a>.</p>
<p>Note - I was one of the average students in the course, so don't consider these the only possible learnings from it. In case you are already acquainted with this stuff, just read this <a href="https://www.scottaaronson.com/blog/?p=3712">blog post</a> recommended by our prof. It is the tl;dr version.</p>
<ol>
<li>Inequalities:
<ul>
<li>We are taught a LOT of inequalities, this image consists of all those that were taught and useful.<br />
<a href="/"><img src="/images/cs5330-inequalities.png" alt="Inequalities used in the course" style="width: 400px;" /></a></li>
<li>The probability inequalities in this course are the Union Bound, Markov, Chebyshev and Chernoff. These are taught and applied aggressively throughout the course. One important thing to note: if you struggled in a probability course like MA2216 because you aren't good with pdfs/joint distributions/proofs for continuous distributions like Gaussian, Poisson, etc., it shouldn't affect your performance in this course, since here the random variables are generally Bernoulli or Binomial, AND often we are not trying to get a precise answer for a probabilistic event; instead we are trying to bound it. Getting a hang of where to add/drop stuff when trying to bound things algebraically is a skill one picks up during this course, and it is quite hard to become good at.</li>
<li>We also exploit this kind of algebraic structure a lot in the course: \((1 - \frac{1}{n})^{n \cdot c \log{n}} \leq e^{-c \log{n}} = \frac{1}{n^c}\), where c is a small positive integer (using \((1 - \frac{1}{n})^{n} \leq e^{-1}\), with \(\log\) taken as the natural log).</li>
</ul>
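<p>A quick numeric sanity check of that chain of bounds (a sketch; natural log assumed):</p>

```python
import math

# (1 - 1/n)^(n * c * ln n) <= e^(-c * ln n) = 1/n^c, since (1 - 1/n)^n <= 1/e.
for n in (10, 100, 1000):
    for c in (1, 2, 3):
        lhs = (1 - 1 / n) ** (n * c * math.log(n))
        rhs = 1 / n ** c
        assert lhs <= rhs
        print(f"n={n}, c={c}: {lhs:.3e} <= {rhs:.3e}")
```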
</li>
<li>Karger's Min-Cut: An elegant algorithm to find a min-cut. Key insights were \(\rightarrow\)
<ul>
<li>Suppose a problem of size \(N\) (in this case, finding the min-cut of a graph of size \(N\)) can be shrunk to a problem smaller in size (in this case, of size \(\frac{N}{\sqrt{2}}\)) with a decent success probability (here it is \(0.5\)). Then instead of decomposing the problem like a straight chain, i.e. going from \(N \rightarrow \frac{N}{\sqrt{2}} \rightarrow \frac{N}{2} \rightarrow \ldots\) while the success probability keeps dropping \(\frac{1}{2} \rightarrow \frac{1}{4} \rightarrow \frac{1}{8} \rightarrow \ldots\) (almost 0), we can instead do branching :)</li>
<li>Branching here refers to this: Let us define \(f(N)\) to be the solve function for the problem of size \(N\) which returns the min-cut. Then instead of the chain method where we go from \(f(N) \rightarrow f(\frac{N}{\sqrt{2}})\), now we will do something like this \(\rightarrow\)</li>
<li>\(f(N) = \min(f(\frac{N}{\sqrt{2}}), f(\frac{N}{\sqrt{2}}))\), i.e call two instances of smaller size and take the better answer. (Note - both of them initially work on the same graph of size \(N\), but because of randomness they will be contracting edges randomly, i.e the two instances of size \(\frac{N}{\sqrt{2}}\) being called, won’t be clones of one another)</li>
<li>Now, analyzing this branching process, you can see that it has \(2\log{N}\) layers in the recursion tree and each layer has twice the nodes of the previous one. The analysis is similar to a merge-sort algo and slightly slower than the chaining method. BUT the benefit of this approach comes from the fact that the probability of correctness is amplified hugely. Effectively, the success probability of the algorithm = probability that there is at least one root-to-leaf path in the recursion tree with all success edges, where you can traverse an edge downwards successfully with probability \(\frac{1}{2}\). This gets lower bounded by \(\Omega(\frac{1}{depth}) = \Omega(\frac{1}{\log{N}})\) (using a non-trivial tree analysis argument), which is MUCH better than the earlier success odds of \(\frac{1}{2^{2\log{N}}} = \frac{1}{N^2}\).</li>
<li>A nice argument is also shown that there can at most be \(N^2\) distinct min-cuts because success probability of Karger’s algo for a specific min-cut is at least \(\frac{1}{N^2}\), so if you add it up for all distinct min-cuts it shouldn’t exceed \(1\), therefore #distinct min-cuts \(\leq N^2\).</li>
</ul>
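<p>Here is a minimal sketch of the basic (non-branching) contraction step, amplified by plain repetition rather than the recursion tree described above; the example graph and helper names are my own:</p>

```python
import random

def contract_once(edges, n):
    """One Karger run: contract random edges (tracked via union-find) until
    2 super-nodes remain, then report the number of edges crossing them."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    pool = edges[:]
    random.shuffle(pool)
    for u, v in pool:
        if components == 2:
            break
        ru, rv = find(u), find(v)
        if ru != rv:          # contracting this edge merges two super-nodes
            parent[ru] = rv
            components -= 1
    return sum(1 for u, v in edges if find(u) != find(v))

def karger_min_cut(edges, n, runs):
    """Repeat and take the best: each run succeeds with prob >= 2/(n(n-1))."""
    return min(contract_once(edges, n) for _ in range(runs))

# Two triangles joined by a single bridge edge: the min cut is 1.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
random.seed(0)
print(karger_min_cut(edges, 6, runs=100))
```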
</li>
<li>QuickSort Analysis:
<ul>
<li>Two ways to analyze the expected complexity. An interesting thing to learn was that JUST commenting on the Expected Time Complexity of an algorithm is NOT enough to say it is a good/fast algorithm. Think about plotting a Time taken (y-axis) vs Probability (x-axis) graph: it can happen that the tail doesn't fall rapidly, so although the mean is low, there is enough variance that your algo often runs super slowly.</li>
<li>I tried to think of an algorithm with this kind of slowness, but I think it is hard to formalize such a case.<br />
<a href="/"><img src="/images/cs5330-distribution.jpg" alt="Faulty distribution?" style="width: 400px;" /></a></li>
<li>Because if this happens then \(E[X]\) no longer remains in \(O(N)\) and instead goes up, since \(constant * \Omega{(N)}\) also contributes to \(E[X]\).</li>
<li>Therefore we also analyze with what probability is the time complexity far away from the mean and try to show that this is very low. Aka \(\Pr(X \geq c*N\log{N}) \leq \frac{1}{N^c}\).</li>
<li>The insight was that, just like in statistics, the mean of a distribution is NOT an ideal way to boil down all the information about the distribution; similarly, here, boiling all the information down to \(E[X]\) and commenting on it is NOT enough to be confident about the algorithm.</li>
<li>Additional References: <a href="http://www.johnmyleswhite.com/notebook/2013/03/22/modes-medians-and-means-an-unifying-perspective/?fbclid=IwAR0KLvaXqaPYgar02PM7_yrkwMt1Q_yOEI2-N80cHIHjYCUVN8mBViijg-U">Must read, about a unifying way to view mean, median, mode of a given statistic</a></li>
</ul>
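<p>A small experiment (my own sketch, not from the course) that measures the comparison count of randomized quicksort over many trials; for quicksort the tail does fall rapidly, which is exactly the w.h.p. statement above:</p>

```python
import random

def quicksort_comparisons(arr):
    """Randomized quicksort; returns (approximately) the number of comparisons."""
    if len(arr) <= 1:
        return 0
    pivot = random.choice(arr)
    less = [x for x in arr if x < pivot]
    greater = [x for x in arr if x > pivot]
    # ~len(arr) - 1 comparisons against the pivot, plus the recursive work.
    return len(arr) - 1 + quicksort_comparisons(less) + quicksort_comparisons(greater)

random.seed(1)
n, trials = 1000, 200
counts = sorted(quicksort_comparisons(list(range(n))) for _ in range(trials))
mean = sum(counts) / trials
# Expected comparisons ~ 2 n ln n; the worst of 200 trials stays near the mean.
print(f"mean={mean:.0f}, min={counts[0]}, max={counts[-1]}")
```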
</li>
<li>Stable Matching:
<ul>
<li>Not a big topic. We were primarily introduced to deferred decision making and stochastic domination. The problem in itself was put forward as a Balls and Bins problem.</li>
<li>Stochastic domination, although a simple concept, turns out to be very powerful when analyzing something. It basically says that instead of analyzing a complicated probabilistic event, we upper bound that event by a simpler one and analyze the simpler one.</li>
<li>Example: Algorithm \(A\) is successful with probability \(\frac{x}{N}\), where \(x\) is some complicated function of \(N\). But you know \(x \geq \frac{N}{2}\). Then just say let us be pessimistic and say that it is successful with probability \(\frac{1}{2}\) exactly. Then if for this simpler algorithm we realize that it runs in \(N\log{N}\) with high probability, then we CAN SAY that Algorithm \(A\) definitely runs in at most \(N\log{N}\) with same high probability.</li>
<li>Additional References: <a href="https://en.wikipedia.org/wiki/Stochastic_ordering#Usual_stochastic_order">Wiki link for stochastic domination</a></li>
</ul>
</li>
<li>Hashing:
<ul>
<li>Chaining (open hashing) is reduced to a balls-and-bins problem and analyzed using that.</li>
<li>Linear Probing has a somewhat hard analysis to grasp. The intuition of making a binary tree to define clusters is still not super clear to me.</li>
<li>I guess one of the key points is \(E[COST] \leq \sum_{k = 1}^{N}k \cdot \Pr(\text{an elt. is in a cluster of size }k)\\ \leq \sum_{l = 0}^{\log{N}}2^{l + 1} \cdot \Pr(\text{an elt. is in a cluster of size }[2^l, 2^{l + 1}])\)</li>
<li>Now we show \(\Pr(\text{an elt. is in a cluster of size }[2^l, 2^{l + 1}])\) is small, that is exponentially decreasing with \(l\). To show this we need 4th-moment inequalities and a non-trivial/magical idea involving a “binary tree” and “crowded contiguous segment definition”. I am still unclear as to why we need all these components for the proof to go through and still in the process of trying to understand this portion.</li>
</ul>
</li>
<li>Flajolet Martin:
<ul>
<li>Perhaps the MOST insane algorithm I have ever seen. The algorithm is super short, with just 3-4 main steps. But as a competitive programmer, the magical thing is that it enables you to “COUNT THE NUMBER OF DISTINCT ELEMENTS IN A STREAM/ARRAY USING A MIN FUNCTION AGGREGATION”. It has two parameters for controlling the algorithm: first, how close it gets to the optimal answer, in terms of a delta; second, the error probability with which it falls outside that delta range. From a practical competitive programming standpoint the algorithm is slightly redundant, since it requires a lot of runs to reduce both these errors enough to pass on online judges. But still, its idea is super fascinating.</li>
<li>We discuss the FM algorithm, then FM+ which takes the average of a lot of instances of the FM algorithm.</li>
<li>This averaging of a lot of results is USEFUL in reducing the variance of the algorithm and thus making delta smaller. (This concept is more general in CS, and is equivalent to why people use random forests over decision trees in ML to reduce the variance of their results.)</li>
<li>Then we make the FM++ algorithm, which runs a lot of instances of FM+, sorts all the answers and returns the median. This is done because for the median to go bad (i.e. lie in the error region), more than HALF of the FM+ runs need to go bad. In FM+ we control the bad probability by some constant (e.g. in lecture we used \(\frac{1}{4}\)), so effectively FM++ fails only if more than half of the runs give a bad result, and this probability, intuitively, decreases exponentially with the number of FM+ runs we do. It is like saying you have a coin which gives HEADS with probability $\frac{1}{4}$ and TAILS otherwise: what is the probability that more than 50% of \(K\) tosses give HEADS? This can be seen to decrease exponentially with \(K\) using the Chernoff Bound.</li>
<li>The prof also told us that this “FIRST MEAN, THEN MEDIAN” technique is more of a general technique in randomized algos and also tested this in the midterm examination.</li>
<li>Additional References: <a href="https://en.wikipedia.org/wiki/Approximate_counting_algorithm">Increment counter algorithm, with a similar idea, tested for midterm</a></li>
</ul>
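<p>A toy sketch of the FM → FM+ → FM++ pipeline (my own simplified variant, not the lecture's exact algorithm: each instance pseudo-hashes elements into \((0,1)\) and keeps the minimum, which is about \(\frac{1}{d+1}\) for \(d\) distinct elements; FM+ averages minima before inverting, FM++ takes the median):</p>

```python
import random
import statistics

def fm_min(distinct, seed):
    """One FM instance: pseudo-hash every element into (0,1), keep the minimum.
    With d distinct elements, E[min] = 1/(d+1)."""
    return min(random.Random(hash((x, seed))).random() for x in distinct)

def fm_plus(stream, k, seed):
    """FM+: average the minima of k independent instances (variance shrinks),
    then invert to get a distinct-count estimate."""
    distinct = set(stream)  # the minimum only depends on the distinct set
    avg = sum(fm_min(distinct, (seed, i)) for i in range(k)) / k
    return 1.0 / avg - 1.0

def fm_plus_plus(stream, k, m):
    """FM++: median of m FM+ estimates; more than half the runs must go bad
    for the median to go bad, so failure prob drops exponentially in m."""
    return statistics.median(fm_plus(stream, k, j) for j in range(m))

stream = [i % 500 for i in range(10_000)]  # 500 distinct elements
est = fm_plus_plus(stream, k=64, m=9)
print(round(est))  # should land near the true count of 500
```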
</li>
<li>Min Set Cover:
<ul>
<li>Model the problem as an ILP (Integer Linear Programming) problem. Then hand-wave and say ILP is ALMOST like LP. Use an LP solver as a black box. But wait… now the solutions are real numbers, not integers, so you use ROUNDING to get the results. Rounding naively might not be good, and the way you round your results is problem specific. In the case of Set Cover, the prof showed us a specific rounding method which worked. Using that rounding method, he showed that the algorithm gives a valid answer with probability \(1-\frac{1}{e}\), which is reasonably high but only a constant. NOTE - this is NOT the probability of being OPTIMAL, just of giving a VALID SET COVER.</li>
<li>Now the interesting issue: if you think we might be able to do something like Karger-style branching / FM++ and combine multiple runs of this algo to increase this probability, then you are correct. HOWEVER, there is no method of merging the answers that keeps the answer small. So what you do is RUN this instance \(\log{N}\) times and take the union of all the sets found, which increases the VALID SET COVER probability exponentially, thus making the algorithm work w.h.p (with high probability); however, the answer NO LONGER REMAINS CLOSE TO OPTIMAL. Instead, the union of \(\log{N}\) instances effectively makes the size of the set cover = \(\log{N}\) * (size of the optimal set cover).</li>
<li>An interesting thing to note here is that this does not give a constant-factor approximation of min set cover (whose optimization version is NP-Hard) using randomized algorithms; rather, you get a logarithmic-factor approximation of this problem with very high probability.</li>
</ul>
</li>
<li>Random Walks and Expert Learning:
<ul>
<li>All these techniques are fairly advanced compared to the topics discussed above, and the fact that we did not have problem sets on these (all these topics are post weeks 7-8) highlights that some technicalities in these lectures were hand-waved OR not meant to be understood completely by an average Joe like me. So I don't think I am in a spot to give any insights on these topics.</li>
</ul>
</li>
<li>Probabilistic Methods:
<ul>
<li>This semester the prof did not go through this topic; however, if you search on the internet, it is a somewhat standard topic in many randomized algorithm courses. The problem the probabilistic method tries to tackle is to comment on the existence / bound of a certain deterministic object using a probability argument.</li>
<li>Example: above, in point 2, we saw that Karger's algorithm gives a side-result that the number of distinct min-cuts is bounded by \(N^{2}\); this is an example of a probabilistic-method use case.</li>
<li>Another example one can try out is to show that a 3-SAT formula with \(k\) clauses has at least one assignment which satisfies \(\geq \frac{7}{8}k\) clauses.</li>
<li>Additional References: <a href="http://web.cs.iastate.edu/~cs511/handout08/Approx_Max3SAT.pdf">MAX-3SAT Notes from another uni</a></li>
</ul>
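<p>The 3-SAT claim can be verified by brute force on a toy formula (a sketch; the formula is my own example, with 3 distinct variables per clause so that each clause is satisfied by exactly 7/8 of assignments):</p>

```python
from itertools import product

# Literal +v means x_v, -v means NOT x_v; each clause has 3 distinct variables.
clauses = [(1, 2, 3), (-1, 2, 4), (1, -3, -4), (-2, 3, 4), (-1, -2, -3)]
k = len(clauses)
variables = sorted({abs(l) for c in clauses for l in c})

def satisfied(assignment, clause):
    return any((l > 0) == assignment[abs(l)] for l in clause)

total, best = 0, 0
for bits in product([False, True], repeat=len(variables)):
    assignment = dict(zip(variables, bits))
    sat = sum(satisfied(assignment, c) for c in clauses)
    total += sat
    best = max(best, sat)

avg = total / 2 ** len(variables)
# A uniformly random assignment satisfies (7/8)k clauses in expectation,
# so SOME assignment must satisfy at least that many.
print(avg, best)  # 4.375 5
assert avg == 7 / 8 * k and best >= avg
```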
</li>
</ol>
<p>Conclusion: The course structure is amazing and they teach a lot of good stuff. Prof. Seth Gilbert explains these algorithms very intuitively and you understand the feel of how the inequalities and math should work out after a few of his lectures. As for the grading component this semester, the module primarily consisted of Problem Sets, Midterm, Experimental Project and an Explanatory Paper on a randomized algorithm related research topic. The module does not have a final exam, so…the Random Walks and Expert Learning portion is NOT really graded anywhere.</p>
Wed, 10 Apr 2019 00:00:00 +0000
https://sidhantbansal.com/2019/CS5330-Randomized-Algorithm-Review/
https://sidhantbansal.com/2019/CS5330-Randomized-Algorithm-Review/My CP Journey - Part 1<p>Hey, so I initially wrote this draft about my journey in CP to be posted by Codechef in one of their blogs, but that never took place. That was a while back, so I have now polished the draft and added some more details to make it more complete.
This post has some Q/A at the end which was supposed to be there for that Codechef blog. I will leave the crux of the content untouched, since I think it is my honest opinion and might help some people, though do note the content of this post was written around a year back.</p>
<p>I started CP (Competitive Programming) somewhere around 8th-9th grade, with the aid of one of the seniors in my school. I still remember the day one of my seniors (Yash Tewari) told me to first do the first 25 Project Euler questions, and only then would he guide me further. It took me around a month to get through those first 25 questions. After that, Yash helped me learn newer algorithms and data structures. It was a fun experience to be able to get help from, and discuss questions with, a senior. Later on, he helped me move to more proper judges such as Codechef, Codeforces and WCIPEG. He also introduced me to Dr. Steven Halim's book, Competitive Programming 3. In retrospect, I think he was the only person who truly mentored me, and I now realize how much patience he had when dealing with all my silly doubts. And yes, after volunteering and helping out some of my own juniors, I can very well say that I have never been as patient with my juniors as he was with me. (<em>RESPECT</em>)</p>
<p><strong>9th grade</strong> - I took my first attempt at ZIO and was fortunate enough to clear it, but at INOI I couldn't really solve anything and ended with a 10/200 score. I would say this was not a big surprise for me, since I did not expect a very good result in the first place: I was not yet good at Dynamic Programming or Graph theory questions. On the other hand, Yash did make the cut and went to IOI-TC (<em>RESPECT</em>)</p>
<p><strong>10th grade</strong> - I had made good friends with Daksh, Rajat, Anubhav and Aditya (all from DPS-Dwarka), who were also prepping for INOI. We had a Facebook Messenger thread where we used to discuss a lot of problems, and we regularly participated in inter-school coding competitions as well as random online competitions. I think the internal competition among us was the primary thing that drove us all and made us grow into good competitive programmers.
Fortunately, most of us were able to crack INOI this time around, in 2015. So we went to IOI-TC 2015. This IOI-TC was not really a competition for us DPS-Dwarka students (since the top scorers were miles ahead of us), but more of an eye-opener which made us realize that the people at the camp are very good, and we would need to put in more work in future years to get into the team of top 4. This was the first time I met the gifted people in the circuit like Malvika, Mriganka, Kushagra, etc. (It has been 4 years since then, and I have forgotten some names.)</p>
<p><strong>Lesson</strong> - Still a long way to go, since I was nowhere near cracking the top 4. I secured rank 13 but couldn't solve any problem completely, so there was a sizeable gap between me and the top students.</p>
<p>Do note that up till this point, i.e. from 8th to 10th grade, I did CP as a hobby. I could devote a lot of time to it since there was no expectation nor reason for me to put a lot of effort into regular academics, so these days were stress-free in general because of chill academics.</p>
<p><strong>11th grade</strong> - I would say this was the toughest year for me in school, since on one side you have the pressure of normal academics and JEE. Unfortunately, I was enrolled in FIITJEE, and during the first half of 11th I was in a bit of a middle ground as to whether I should prepare for JEE or not. By the middle of 11th I had decided to mostly drop out of my FIITJEE classes, deciding that I wasn't cut out to prepare for JEE, mainly because it required being good in Physics and Chemistry along with Math, and also because I found much more joy in practicing on Codeforces or some other judge than in solving package modules. I am in no way saying that this was a good decision for the long-term goal of getting into a college, since it could have backfired had I not been able to represent at IOI / not been selected at future IOI-TCs. But yes, this was one of the turning-point decisions I made.</p>
<p><em>(Note - I believe some students currently in 10th/11th/12th grade might see the above as a kind of motivation to stop preparing for JEE. Do realise that this is just my experience, where the end results went on to be good despite dropping out of FIITJEE and not taking JEE seriously. I know people for whom it did not go the way they wanted, so don't take this as advice of any sort; it is just a narration of an experience, not a suggestion.)</em></p>
<p>During the IOI-TC 2016, I had a bit of hard luck since I was able to stay in top-4 until the second day of the exam at the camp and was performing reasonably well. But on the third day, I wasn’t able to solve one of the doable questions and lost my spot on the team since my rank slid to rank-7. This year, fortunately, Rajat from our school did make it to the team and ended up securing a Silver Medal at the Olympiad (<em>RESPECT</em>). Also this year Sampriti made it to the top-4 in his first time at INOI as well as IOI-TC, it was pretty inspiring to see him succeed in a single attempt at IOI-TC (<em>RESPECT</em>).</p>
<p><strong>12th grade</strong> - I would say that the hardest part of the entire journey for me was going to the IOI-TC the next year knowing full well that if luck wasn't on my side that year, I would flunk and then practically never be able to represent at IOI despite being at IOI-TC thrice. Just the fear of failing that year was enough to give me a feel for what JEE aspirants must face when they write their exam :P
This time around, at IOI-TC 2017, I got typhoid just before the training camp started. The entire process of diagnosing the typhoid and starting my treatment at home took roughly 3-4 days, and therefore I skipped most of the training camp except the last 3 days, on which the exam is conducted. I was somewhat better when I landed in Chennai to write the IOI-TC exam. My parents came along with me, and I stayed with them in a hotel near CMI, since my condition was still fragile. I didn't really have high hopes since I knew I was not in my best physical condition. But damn, I hate to brag, but magically I did fairly well on day 1 of the exam and ended up 2nd on the rank-table. Day 1 was enough to boost my self-confidence and make me believe that I could very well maintain my position for the remaining two days and finally end up in the top 4. And yes, I did end up in the top 4. It was one hell of a ride.</p>
<p><strong>Lesson</strong> - You need to have a good amount of self-confidence no matter how much prepared/unprepared you are.</p>
<p>After the IOI-TC selection test, I went for a roughly 20 days camp at CMI, to be trained by Rajat. I went there along with Nalin, we didn’t really prep a lot and mostly had good food, but we had a good time.</p>
<p><em>Do note that up to this point I had attended a few camps conducted by Codechef, which are as follows -</em></p>
<p><em>1. Gandhinagar Camp - Immediately before IOI 2017. Best camp I attended during my high school period. Since the teachers were pretty good (Kevin Sogo + Errichto / Kamil Debowski)</em></p>
<p><em>2. Snackdown Camp - Attended sometime when I was in 11th grade, conducted by Codechef, teachers mainly were (Arjun Arul + Kevin Sogo). Quite math based content and since I was still in 11th grade and not really an ICPC candidate, therefore I found most of the content not that much relevant to IOI-TC/IOI. But overall good high-quality camp.</em></p>
<p><em>3. Amrita Camp - Attended sometime when I was in 10th grade. Okayish in terms of quality. The accommodation was not that good; however, we were taught by quite good Indian college students (Arjun Arul + Surya Kiran + Ajay Krishna).</em></p>
<p><em>(Note - By this time I had attended some inter-school coding competitions, college coding competitions, SEARCC, Codechef Snackdown, etc. But all these competitions were just good competitions to practice at. They were never the goal themselves, so I won’t be covering those experiences here. Obviously, I think all these competitions are quite good; it is just that I never really prepared separately for them.)</em></p>
<p>Cracking into the Indian team more or less means that you are good enough / have a quite high probability of securing bronze at the IOI. After coming back from the 20-day camp at CMI, we went to IOI 2017. It was a good event and I had a good performance on day 1, but a bad one on day 2. I ended up securing a bronze medal with a rank somewhere around 87th. Although the second day did not go as I expected, since I couldn’t get partials on the 2nd and 3rd problems, I was still more or less content with my performance. Being at IOI, you get to meet truly gifted students. One of the closest encounters, for me, would be Srijon. Since the IOI-TC is only 8 days long with 20 students, you don’t really spend enough time interacting with all of them. It was during the IOI that I interacted with him for the first time, and boy, I envy how smart he is. I think IOI is a good reference point for making almost anyone realize that you can never be good enough and that there is always someone better than you. I think going to IOI made me a humbler person in general.</p>
<p>Now an interesting turn of events took place. Initially, after IOI-TC 2017, I had given an interview at IIIT-H and got admission into the 5-year course through the Olympiad quota. I won’t explain the exact details of the process here; you can google it if you want. I had also applied to NUS during the summer but received a rejection somewhere around June/July.
The Olympiad itself was in August, and there Dr. Steven Halim, the coach of the Singapore contingent (and faculty at NUS, and author of Competitive Programming 3), met me after the results were out and by sheer magic got my NUS decision flipped to an acceptance. Damn, I was lucky. Perhaps the luckiest I have ever been. Since the session had already started at NUS, I had to decide whether to go or not within a short span of 3-4 days. After some discussion, my parents and I ended up opting for NUS over IIIT-H, primarily because we thought that the education and opportunities I would get at NUS would be greater than at IIIT-H. Now, after being at NUS for roughly 2 years, I can say we made the correct call, and I was damn lucky to have interacted with Dr. Halim and will always be thankful to him :)</p>
<p><strong>Conclusion</strong> - Although competitive programming stands for doing programming as a sport, competitively, it was at IOI-TC and IOI that I understood that a part of competitive programming is about appreciating what others have achieved in terms of skill through their hard work, intellect, and passion. At places like these you get to meet smart people, really smart people, and if you simply envy them for what they are, you will probably end up being salty, enraged, and pissed off. It is much wiser to just genuinely appreciate them.
The journey of high-school CP ended here for me; I was more or less content with the eventual results. One bronze medal as a witness to 4 years of hard work from 9th grade to 12th grade. Not a bad deal, I would say. The next part is still due, though, since preparing for a completely different ball game (aka ACM-ICPC) started when I entered NUS. This blog post would become a pain to read if I included that too, so I will just make another blog post after the end of this semester, titled <strong>The Journey - Part 2</strong>, for that.</p>
<p><em>(Note - In this entire blog post I didn’t really cover technical things such as where I used to practice or which portions of CP I was strong/weak at. I did not do this mainly because I am still evolving and do not have a concise answer. One thing I can say right off the bat is that for practice I mainly used Codeforces, Codechef, WCIPEG, and past IOI-TC questions, up till my 12th grade.)</em></p>
<hr />
<p>FAQs -</p>
<p>Take all the suggestions below with a pinch of salt, since I am neither the best person to give advice, nor should you put faith in a guy you came across online :)</p>
<p>Q) Is DPS Dwarka a good place for someone interested in CP?</p>
<p>A) I do not endorse my school in the above post, and I think that the faculty and principal were not particularly helpful/supportive towards students who did CP. The main benefits I had at my school were the presence of the Computers Club (C.O.R.E), where we were able to interact with other students interested in the same field, and the fact that we were able to participate in Delhi-based inter-school programming events. But apart from that, my school hasn’t yet started recognizing students pursuing Olympiad preparation (supporting them with leave of absence to prepare for the Olympiad, or funding travel for inter-school events in other states), despite having more than 6-7 IOI-TC students and 2 medallists. I still believe it is a good school if one wants to prepare for and excel in Boards / JEE, but I have found that a lot of students doing CP have the illusion that DPS Dwarka has classes or something to train students in CP, since we had a good trend of IOI-TC students in the past 2-3 years. So I am clarifying what I think about my school, so that someone deciding to come to DPS Dwarka for this very purpose makes a more informed decision.</p>
<p>Q) Do I have any suggestions for people who wish to improve in CP? (Originally written for the Codechef blog)</p>
<p>A)</p>
<ol>
<li>Spending time thinking about problems is MORE VALUABLE than coding out the solutions.</li>
<li>A strong mathematical background helps a LOT. (Example - proficiency in permutations and combinations, induction, probability, etc.)</li>
<li>In India, we sometimes focus a lot on data structures as compared to the reduction/problem-solving portion of our problems. I would suggest students give more weight to the problem-solving aspect than to strengthening their data structures, since I believe that in the final rounds of IOI-TC / IOI the trend is that advanced data structures are generally not tested as frequently as strong problem-solving skills. For this, if you are free, you might even go to the extent of practicing for RMO/INMO if you think that helps.</li>
<li>I think proving your solution/algorithm is underrated in CP. Going by your intuition is a good strategy during a contest, but during practice, I believe trying to prove your approach is essential. It strengthens the concepts and will hopefully help you crack a harder problem in the future because of the insights gained while attempting the proof.</li>
<li>If you are in the Delhi / Kolkata / … region, you can find a good number of inter-school programming competitions happening around your locality. I would suggest getting involved in these events to gauge how you perform under pressure and against your peers. Still, don’t consider these events the best metric of your performance, since inter-school programming events sometimes don’t have good-quality problems.</li>
<li>Make a group of good close friends with whom you can discuss problems at length. I have found this method very productive for my growth, since it helps me learn about interesting problems my friends find. Also, when you discuss, the discussion is not about the code; it is about finding the approach and why it is correct. This discussion is often much more valuable than practicing more problems, in my opinion.</li>
<li>Start early. I think that starting something like CP in your 10th-11th grade can sometimes be hard on the student, because he/she might face the dilemma of splitting their time between academics (PCM / JEE / Boards) and CP. Starting out in 8th-9th grade gave me an edge over others for 2 reasons. Firstly, experience: I gained a lot of experience in onsite contests because I appeared in so many. Secondly, strengthening of concepts: when you read about a certain concept in blogs/editorials, the time it takes to digest it can vary from person to person. But if you start early, you are not on a clock to do things hurriedly; you can take a lot of time for things you find difficult.</li>
<li>Please don’t make fake accounts. In India, amongst school-level CP student groups, in the last 2-3 years there has been a trend of students making fake accounts for practicing on online judges. Example - let us say that I make an account named “xyz” on Codechef and start doing problems from this account from now on. People do this because of the competitive environment; the reason is so that others do not get to know what problems they are doing. But I think it is morally incorrect if one person has a fake account and another doesn’t. Why? Label the person using the fake account as A and the one with a normal account as B. Here A has the advantage of seeing what B is doing in practice, but B is deprived of this. Secondly, during live contests A can see how B performs, but B is unaware of whether A participated or not. This is a typical example of a prisoner’s dilemma. Also, if all the students make up their minds to use fake accounts, then another problem comes up: technically no one knows what anyone else is doing. In a case like this, let us say that A becomes a good CP competitor and now his juniors look at his account to see what kind of problems they should attempt or practice. They would not have a clue as to what to do, since A would not have made any submissions from his original account. This basically takes us a step back in building a good CP community in our locality/country. I have seen people doing this first-hand, and I personally hate this practice and think it is quite unethical.</li>
</ol>
<p>Q) Has doing CP helped you in your career/academic pursuits? (Originally written for the Codechef blog)</p>
<p>A) Yes, I think CP has helped me a lot in my career/academic pursuits. As for academics, I think it improved my problem-solving skills in math. And now, being in college, I am not hesitant about taking good algorithms-related courses; in fact, algorithms are going to end up being my specialization area in my undergrad. Apart from that, it helped me get into a college, since I did not really prepare for JEE, nor did I have a strong profile for US universities abroad. One thing I would like to mention, however, is that if you want to do CP, then do it for the fun, do it for the glory, do it for the prestige of IOI. Yes, it will help you in college admissions in India (CMI and IIIT-H recognize it to a certain extent) and abroad, but if that is your sole aim, then I don’t think you will have a strong enough drive to actually crack the entire process, because you will fail somewhere across the 4 rounds of screening: ZIO / ZCO, then INOI, then IOI-TC, and finally IOI. I believe that the motivation rooted in passion for the subject and in the prestige of IOI is far greater than the (greed/fear) of (getting/not getting) into a good (Indian/foreign) university.
I think the most prominent example of this would be Rajat. One of the best things about his work ethic was that he did the Olympiad because of the love/obsession he had towards problem-solving. If you have ever interacted with him, you will realize how little it mattered to him whether doing CP would be a profit or a loss for him in terms of career/academic prospects.
So I would conclude that yes, it does help you in your academic pursuits, but see it as a byproduct of doing the hard work and not as the goal. Because if seen as the goal, then I believe we will slowly transform the Olympiad into a JEE of some kind, where people do it just to get into a good college. And frankly, I think one JEE exam in India is enough; we don’t need another.</p>
Sat, 09 Feb 2019 00:00:00 +0000
https://sidhantbansal.com/2019/My-Cp-Journey-Part-1/