One Point Solution

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Sunday, 28 July 2013

SRM 586 division 2 editorial preview

Posted on 20:16 by Unknown

Another day, another editorial preview. Here are the division 2 editorials for SRM 586: https://apps.topcoder.com/wiki/display/tc/SRM+586

SRM 586 introduced python support in TopCoder. I really wanted to give it a try and use it in editorials. The small issue is that their wiki system doesn't have python syntax highlighting. It actually doesn't have c++ highlighting either , and I was using the Java one. The Java one is not so great either, it highlights keywords inside comments.

I have had to include yet another javascript library to what is included by the wiki. Because I like pretty-looking formulas that I can type by hand, it already included mathjax. Because of my obsession to use SVG images and the need for a png fallback, they eventually started to use Modernizr as well. Now the slippery slope is even worse and I decided to add syntax highlighter. It is much better than TC's wiki highlighter (actually JIRA), but my editorials are starting to have too many heavy requirements, I have many mixed feelings.

Anyway, here is one of the problems for fun:

The hard problem - string weight.

The weight of a string is defined as follows: For each letter present in the string, find L and R, the minimum and maximum positions of the letter. The weight is equal to the sum of all R - L for each of the letters.

Given L `(1 <= L <= 1000)`, count the number of strings of minimal weight of that length.

What makes a minimum-weight string?

Before we get into counting these strings, we should understand their properties.

Like the first two examples show. In cases with small L, there are ways to make letters contribute a flat 0 to the weight. This is the smallest possible contribution. By definition: `(R >= L)` therefore: `(R - L >= 0)`. Since we want to minimize the total weight, we should try to make most if not all letters contribute 0.

It is possible to make all letters cost 0. Simply make all letters in the string unique: "acwh". Then for each letter: `(R = L)`. However, we only have 26 available letters. Therefore, we can only make each letter in the string unique when `(L <= 26)`. In that case, we have to use `L` different letters.

When `(L > 26)`, things will change. Let us begin with the smallest of such cases: `(L = 27)`:

  • If all letters were equal, then for this letter, `(R = 26)` and `(L = 0)`. The total weight is 26.
  • Perhaps it is better to have as many cost-0 letters as possible? As many letters that appear only once. If we use 25 unique letters, there are two remaining positions that must be used by a repeated letter. We are in a interesting situation: 25 letters contribute exactly 0, regardless of their position and we have two positions used by the same letter. We can pick L and R as the two positions used by this letter. The final weight will be `R-L`, so we want them to be as close as possible, let us make them consecutive , this means that the cost is 1. As long as 25 letters appear exactly once and the remaining letter appears twice but in consecutive positions. The minimum string weight is 1.

But how about larger cases? For L = 40. We can try the same logic, make 25 letters appear exactly once and a final letter appear exactly 15 times. The positions of the 25 letters do not matter, so we only care about L and R for the letter that appears 15 times. We should make the difference `(R - L)` as small as possible. Once again, it is most convenient to make the 15 letters appear consecutively. As long as we do this, the result will be: 14. If L is the first position, then the 15 positions of the letter will be: `(L, L+1, L+2, ... , L+13, R = L+14)`.

But we should be critical, although this strategy seems to minimize the total weight, is it the only strategy that manages to do it? In the previous idea, we had 25 letters that appear exactly once and one letter that appears 15 times. However, what about a small change? What about having 24 letters that appear once, 1 letter that appears twice and one letter that appears 14 times? The sum is still 40. Once again, we can ignore the positions of the 24 letters (0 weight). The positions of the other letters are also not important as long as the repetitions are consecutive. If we make sure they are consecutive, the cost of the letter that appears twice will be 1: `(L, R = L+1)`. For the other letter, the cost is 13: `(L, L+1, L+2, ..., L+12, R = L+13)`. The total weight is interestingly, 14: 13 + 1. This is also the weight we assumed to be minimum!

It seems that making sure 25 letters appear exactly once is not necessary to minimize the weight. However, making sure that repetitions are always consecutive still seems important. Let us put some detail.

Assume that the only thing we know about a sequence is that it has 26 different letters and repetitions are always consecutive. Let us calculate the weight of the string. Let us say the first letter appears `K_1` times, then it will fill positions `(0, 1, ... , K_1-2, K_1 - 1)`. It will contribute `(K_1 - 1)` to the weight. The second letter appears `K_2` times: `(K_1, K_1+1, ... , K_1 + K_2 - 2 , K_1 + K_2 - 1)` and therefore contributes `(K_2 - 1)` to the weight. And so and so. The total weight will be: `(K_1 - 1) + (K_2 - 1) + ... + (K_26 - 1)`. The total amount of letters is `L` , therefore: `(K_1 + K_2 + ... + K_26 = L)`. If we tweak things, you can find that the total weight will be: `(L - 26)`. This formula does not depend on the actual number of times each letter appears in the string. And it seems to be the minimum. This also shows that not using all 26 available letters will only yield a higher weight. E.g: if we use only 20 of the available letters the weight will be: `(L - 20)`.

In order to show that this is the minimum, let us prove that not keeping the repetitions consecutive will always yield a larger weight. If we wish to place `K` letters `a` in the string, making them consecutive will make it contribute `(K-1)` to the weight. If they are not consecutive, then there is at least one other letter `b` that will be between the two extremes of the letter in question. This will increase the weight contributed by letter `a`. The weight contributed by letter `b` will either stay equal (if all its instances are consecutive and between the `a` letters) or also worsen. This means that keeping all repetitions consecutive will yield the minimum weight. That this weight is `(L - 26)` and that any other strategy will increase the weight.

How to count the minimum-weight strings

Simple case

If `(L <= 26)`, then each letter in the string must be unique. So what is the cost of making a string of `L` unique letters, where there are 26 available letters in the alphabet? For the first position, we have 26 options. The second position has 25 options (we cannot use the same letter as the first position). Third position has 24 options and so and so. Therefore, the number of strings is:

`26 * 25 * 24 * ... * (26 - L + 1)`

L > 26

In this case, we need to count the number of strings that follow some properties:

  • Each letter in the alphabet (26 options) must appear at least once in the string.
  • All the repetitions in the string must be consecutive.

How can we count the number of strings?

Dynamic programming

Let us define `f(a, L)` as the number of ways to build a string of length `L` that follows those properties using an alphabet of `a` different letters.

A base case: `(L = 0)`, all the `a` letters must be inserted at least once. If `(a > 0)`, then there is a letter in the alphabet that we cannot use, because there are 0 spaces left, the result is 0. If `(a = 0)`, then there are no more letters to include and no spaces, the total numbers to do it is 1.

In another case, `(L > 0)` and `(a = 1)`, there is only one letter available, we better fill all the `L` positions with it. There is only one way to do this.

When `(L > 0)` and `(a > 1)` , then we have `a` choices for the letter to use in the first positions of the string. The number of those positions can be any `i` : `(1 <= i <= L)`. After using this letter in the first `i` positions, we still need to fill the remaining positions. There are `L-i` remaining positions and we cannot use the letter that we already used. So the partial result is: `a * f(a - 1, L - i)`. We should add together the results for each i, so:

`f(a > 1, L > 0) = sum_(i = 1)^(i <= L)( a * f(a - 1, L - i) )`

There are 27 possible values for `a` and `O(L)` values for `L`. Each time we call the function, we might need `O(L)` steps to perform the sum. If we use dynamic programmingor simple memoization to ensure that each argument combination for the function is evaluated at most once, then the algorithmic complexity will be: `O(A * L * L)` where `A` is the size of the alphabet, 26 in this case. This complexity is low enough for the time limit.


static const int MOD = 1000000009;

long dp[27][1001];
long onceConsecutive(int a, int L)
{
long & res = dp[a][L];
if (res == -1) {
if (L == 0) { // base case
res = (a == 0);
} else if (a == 1) {
res = 1;
} else {
// calculate the sum. (We need to use modular arithmetic)
res = 0;
for (int i=1; i<=L; i++) {
res += (a * onceConsecutive(a-1, L-i) ) % MOD;
}
res %= MOD;
}
}
return res;
}

int countMinimums(int L)
{
memset(dp, -1, sizeof(dp));
long res = 1;
if ( L <= 26 ) {
// The simple case:
//26 * 25 * 24 * ...
for (int i=0; i < L; i++) {
res = (res * (26 - i)) % MOD;
}
} else {
// each letter must appear at least once, all instances
// of each letter must be consecutive.
res = onceConsecutive(26, L); //Use the recurrence relation
}

return res;
}

Math

An alternative solution. We will separate the decision in two parts: a) The order in which the 26 letters are used and b) The number of positions each letter (In the order picked by a) ) will take in the string.

The order of letters is a permutation. The total number of ways to pick the order is: `26!`. Where `n!` is notation for the factorial of `n`.

Now that the order of the letters is picked, we decide the number of positions each letter will take. The total number of positions is `L` . How many solutions are there to the equation: `(K_1 + K_2 + ... K_25 + K_26 = L)` where each `K_i > 0`.

As you will see later, that question is easier to solve when `K_i >= 0`. We can do the conversion though Let `r_i = K_i - 1`, each `r_i` is the additional number of times the letter is picked.

`K_1 + K_2 + ... + K_25 + K_26 = L`
`r_1 + 1 + r_2 + 1 ... + r_25 + 1 + r_26 + 1 = L`
`r_1 + r_2 + ... + r_25 + r_26 = L - 26`

Since each `r_i` can be 0, we can use the following trick: We have to place `(L - 26)` stones in 26 boxes. We are able to leave some boxes empty. We can picture this as placing separators between some of the stones. For example:

oooo|ooo|oo||oo|o

Is the same as placing 4 stones in the first box, 3 stones in the second, 2 stones in the third, 0 stones in the fourth, 2 stones in the fifth box and 1 stone in the sixth box.

For 26 boxes, there will be 25 separators. How many ways to combine the separators and stones exist? There are `(25 + L - 26)` positions of which we pick 25 for the separators. The value is: `C(L - 1, 25)`, where `C()` is the binomial coefficient

.

The final result is therefore: `26! * C(L - 1, 25)`.


# define function f() as the factorial:
from math import factorial as f

def C(n, k):
return f(n) / f(k) / f(n - k)

#Calculates weight without the modulo
def Weight(L):
if L <= 26:
# 26 * 25 * 24 * ...
# the same as saying 26! / ( (26 - L)! )
return f(26) / f(26 - L)
else:
# 26! * C(L-1, 25)
return f(26) * C(L - 1, 25)

class StringWeightDiv2:
def countMinimums(self, L):
return Weight(L) % 1000000009 # Takes the remainder from the result
Email ThisBlogThis!Share to XShare to Facebook
Posted in editorial, topcoder | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • TopCoder SRM 557 - finally
    SRM 557 Explanation for division 1 Easy and match recap. Explanations for div2 easy and div2 medium. It feels like it has been ages since t...
  • SRM 589 Editorial
    I have finished writing the editorial for TopCoder SRM 589: http://apps.topcoder.com/wiki/display/tc/SRM+589 . As you most likely noticed. L...
  • SRM 590 recap and editorial
    Another week another Topcoder match. Not a great day. I had a bad flu and still do. Div1 500: The one with Xor Given a list of cards with nu...
  • SRM 546: relief
    I figured I should post something about this SRM. I've been very busy these weeks because the semester is ending and I tried to win a t-...
  • SRM 526: The killing wait for results
    While I wait for results, here is my perspective on this algorithm contest. It began with issues, it had to be postponed 15 minutes. TC has ...
  • SRM 554 div1 hard: TheBrickTowerHardDivOne
    Link to problem statement We got infinitely many bricks of dimensions 1x1x1 and C different colors. Count the number of towers of size 2x2...
  • SRM 533: Div1 500 MagicBoard explanation
    Finally solved it. It is a nice problem that is worth explaining in a post. You have a grid/board of at most 50x50 cells. Some cells contain...
  • Member SRM 505: Part 1
    So, let me explain a couple of problems from a Topcoder Member SRM that I wrote and never got an editorial. BTW, it was the last member SRM....
  • ListedLinks 2012-02-10
    Saturday Morning Breakfast Cereal comics: Grace Hopper's ghost That Oracle engineer blog post Oracle would really not like anyone to se...
  • Codeforces "Good bye 2013" round
    So it was a special round for coders of both divisions, problems ranged from the super easy problem A to the super difficult problems E,F,G....

Categories

  • acm
  • algorithm
  • answers
  • arenaplugin
  • badday
  • behindthescenes
  • bugs
  • c++
  • censorship
  • codechef
  • codeforces
  • contests
  • crocchamp
  • editorial
  • editorial.srm
  • embarrassing
  • explanation
  • gcj2013
  • gmp
  • goodday
  • google
  • googlecodejam
  • greed
  • groklaw
  • health
  • html
  • httpseverywhere
  • implementation
  • ipsc
  • ispc
  • java
  • kawigiedit
  • kindagoodday
  • lamebook
  • languages
  • lego
  • listedlinks
  • marathon
  • nasa
  • offtopic
  • ouch
  • postmortem
  • postportem
  • practical
  • probably_not_a_good_tip
  • problemsetting
  • programming
  • python
  • quora
  • rant
  • recap
  • slightlygoodday
  • snippet
  • srm
  • stl
  • strategy
  • swerc
  • tco
  • tco12
  • tco13
  • tco2012
  • tco2013
  • ternarysearch
  • topcoder
  • tricks
  • ubuntu
  • uva
  • vjass
  • vkcup
  • wc3
  • zinc

Blog Archive

  • ►  2014 (1)
    • ►  January (1)
  • ▼  2013 (141)
    • ►  December (14)
    • ►  November (8)
    • ►  October (13)
    • ►  September (11)
    • ►  August (14)
    • ▼  July (15)
      • KawigiEdit update 2.2.1
      • KawigiEdit update 2.2.0
      • SRM 586 division 2 editorial preview
      • New version of KawigiEdit-pf
      • SRM 585 Editorial for EnclosingTriangle
      • SRM 586 : Meh
      • My TopCoder coding setup (update 4)
      • In which I tweak KawigiEdit
      • "Topcoder Arena doesn't work in openJDK 7"
      • SRM 585 Recap and Editorial preview
      • Editorial for TCO round 3B is ready
      • Editorial for TopCoder Open round 3B ToastJumping
      • More about c++11 - tuples, tie and make_tuple
      • Today's TopCoder Test SRM
      • TopCoder Test SRM, new c++ features
    • ►  June (13)
    • ►  May (13)
    • ►  April (12)
    • ►  March (11)
    • ►  February (11)
    • ►  January (6)
  • ►  2012 (94)
    • ►  December (5)
    • ►  October (6)
    • ►  September (8)
    • ►  August (6)
    • ►  July (3)
    • ►  June (5)
    • ►  May (8)
    • ►  April (10)
    • ►  March (20)
    • ►  February (16)
    • ►  January (7)
  • ►  2011 (51)
    • ►  December (7)
    • ►  November (12)
    • ►  October (5)
    • ►  September (1)
    • ►  August (3)
    • ►  July (4)
    • ►  June (3)
    • ►  May (7)
    • ►  April (3)
    • ►  March (2)
    • ►  February (1)
    • ►  January (3)
  • ►  2010 (9)
    • ►  December (4)
    • ►  October (1)
    • ►  June (1)
    • ►  May (1)
    • ►  January (2)
  • ►  2009 (1)
    • ►  December (1)
Powered by Blogger.

About Me

Unknown
View my complete profile