Archive | August, 2011

Problem Eight

31 Aug

Here’s problem eight.

Question: Find the greatest product of five consecutive digits in the 1000-digit number. (a very large number follows)

This problem is fairly straightforward. I initially converted the number into a list, but then I realized that strings are also iterable! This made the code slightly simpler.

Basically we pull substrings out of the main string (the holds the big number), and find the product of its digits. If the product is greater than the previously largest product, we overwrite it.

So here it is:

n = '73167176531330624919225119674426574742355349194934\
96983520312774506326239578318016984801869478851843\
85861560789112949495459501737958331952853208805511\
12540698747158523863050715693290963295227443043557\
66896648950445244523161731856403098711121722383113\
62229893423380308135336276614282806444486645238749\
30358907296290491560440772390713810515859307960866\
70172427121883998797908792274921901699720888093776\
65727333001053367881220235421809751254540594752243\
52584907711670556013604839586446706324415722155397\
53697817977846174064955149290862569321978468622482\
83972241375657056057490261407972968652414535100474\
82166370484403199890008895243450658541227588666881\
16427171479924442928230863465674813919123162824586\
17866458359124566529476545682848912883142607690042\
24219022671055626321111109370544217506941658960408\
07198403850962455444362981230987879927244284909188\
84580156166097919133875499200524063689912560717606\
05886116467109405077541002256983155200055935729725\
71636269561882670428252483600823257530420752963450' 

result = 0 
factors = [] 

for x in range(0, 996): # tests from n[0] to n[995]
	product = 1
	digits = n[x:x+5] # digits is a substring of 5 consecutive digits in n
	
	for d in digits:
		product *= int(d)

	if product > result: 
		result = product
		factors = list(digits) 

print(result, factors)

The number is stored as a string in the variable n. We initialize two variables, result, which will be the largest product, and factors, a list that will hold the factors of that number (just for fun). Since we are testing numbers in 5-digit-long “chunks” (substrings), the last “chunk” we need to test is the last five digits, or the positions n[995] to n[999]. So we set a for loop to do this.

The product is set to 1 every time we iterate, and digits is the substring starting at position x and ending (not including) at position x + 5. This 5-digit substring is stored in digit and we iterate through it, multiply the product by each of its digits. We then check to see if the product is greater than the previously largest product (initialized to zero), and if true, we overwrite it and store the digits as a list in factors.

When it finishes we print the result.

Answer: 40824 (9*9*8*7*9)

Problem Seven

30 Aug

On to problem seven.

Question: What is the 10,001st prime number?

This problem was more tricky that I had first expected. I came up with a solution after an hour of tinkering around, but it took around 110 seconds to run (much too long). I probably should’ve done some research for calculating primes beforehand though, since there was a simple trick that made the result 100 times faster (it finished less than a second!). I’ll discuss this theorem/postulate later.

This was also the first problem I had to write a new function for, in order to keep it somewhat readable.

Let’s think how to solve this problem. We’ll need some way to determine if a number is prime, and also a counter to keep track of when we reach the 10,001st prime. Since we don’t know how many times we need to iterate, we use a while true loop.

Here’s the code:

def isPrime(x): 
	
	divisor = 3
	limit = int(x**0.5) + 1 # the highest factor we need to test is the square root of the number

	while divisor < limit: 
		if x % divisor == 0:
			return False
		else:
			divisor += 2 

	return True 

counter = 1 # keeps track of number of primes, 3 is second
i = 3 # start testing at 3 

while counter != 10001: 
	if isPrime(i):
		counter += 1 

	i += 2 #iterates over even numbers

print(i - 2)

Let’s look from line 14-15 first. We start the counter at one rather than zero, since we don’t include two as a prime, so three will count as the second prime we find. We keep track of the current number we’re on with the variable i. Moving down, we see a while loop that runs as long as the counter is under 10001, meaning that we haven’t found the 10001st prime yet. Inside the loop, we check if i is a prime using the function isPrime(), and if it is, we add one to counter. If not, we add two to i and try again. The reason for adding two is that every other number is even, which will not be a prime, so we can completely skip calculating those.

When it’s done running, we print the curious line i – 2. Why? Because when we find the 10001st prime and counter is set to 10001, before the loop stops running, we add two to i, or the current number (which is the 10001st prime). Printing i – 2 is simpler than dealing with this using if statements.

So now let’s take a look at the bulk of the code, the isPrime() function.

We set the initial divisor to three, since anything will divide into one and we already sifted out the even numbers. We have an upper limit that’s a bit esoteric, which I will explain in a second. Inside the while loop, we keep testing to see if the number we passed in as a parameter is divisible by the divisor. We increment the divisor every time until it reaches a limit, and then it stops. If the number is divisible by any divisor under this limit, we know that it’s not prime and it returns False. If it’s not divisible by any of the divisors under the limit, then the while loop ends and we return True (it’s a prime).

Now for the limit variable. Initially I used no (defined) limit, and I would test all the numbers below the parameter to see if it was a factor. However, this was very slow and made the code run for more than 100 seconds. In the answers forum, I saw that someone had used a similar technique, but with the square root of the number as a limit. I googled this and found the mathematical reason. Basically, factors of a number come in pairs. One will be smaller, and the other will be larger. The only time when the two factors will be equal are when they are square roots of that number. So essentially, the largest factor of a number you need to worry about is the square root of that number. Any factor higher than that will have a corresponding lower factor, which has already been ruled out. That’s why the limit is set to the square root of the parameter (plus one).

Answer: 104743

Problem Six

30 Aug

Next is problem six:

Question: Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

This problem is pretty straightforward, and doesn’t really require much more effort than a for loop.

Here’s the code:

thesum = 0 # the sum of natural numbers under 100 (we square it later)
sqrsum = 0 # the sum of the square of the natural numbers over 100

for i in range(1, 101):
	thesum += i
	sqrsum += i**2

print(sqrsum - (thesum**2)) 

I think the most confusing thing here is my choice of variable names: thesum stands for the sum of all natural numbers under 100, and sqrsum stands for the sum of the squares of those numbers. We iterate though this loop 100 times, and print the answer.

Note that the result of this code is negative, but the solution for the problem is positive. It’s just a matter of the subtraction order.

Here’s a streamlined version of the above code (similar to problem one’s succinct version):

print(sum([x**2 for x in range(1, 101)]) - sum([x for x in range(1, 101)])**2)

Again, we’re just creating a list of integers from 1-100, and adding them up with the sum() function.

Answer: 25164150

Problem Five

29 Aug

Here’s problem five:

Question: What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?

My first solution was to brute force it, so it would test every number upwards from twenty for divisibility, but that was pretty inefficient. It took around 100 seconds to run, since the answer is in the 200,000,000s. So I thought about the problem some more. Since it was given that 2520 is the LCM by 1-10, the answer must be a multiple of 2520 (by definition, the LCM of 1-20 must be divisible by the numbers 1-10, and therefore must be divisible by the LCM of 1-10). So for each iteration we only need to test the next multiple of 2520 for divisibility by 11-20, since we know that 1-10 will always be factors.

This speeds up the code significantly:

run = True
i = 252

while run:

	# all() -- if every element in the list is true
	if all(i % n == 0 for n in range(11, 21)):
		print(i)
		run = False

	else:
		i += 2520

The while loop is set to run as long as the run variable is true. When we find the solution, run is set to false and it stops running. The current number is set to the variable i, which is initialized to 2520. If the current i is not divisible by 11-20, we add 2520 to it and repeat.

The divisibility testing might require some explanation. I initially wrote it as something along the lines of:

if i % 11 == 0 and i % 12 == 0 ... and i % 20 == 0

This was annoying to type, and it also looks pretty messy and amateurish. I played around with the code for a while, trying to maybe make the divisor iterate through a for loop, but I couldn’t come up with a way to make sure that it was divisible by all of them without writing it all out. So I consulted the programming geniuses at stackoverflow for help. (Here’s the post: http://stackoverflow.com/questions/7225168/testing-divisibility-by-multiple-numbers)

They suggested I use a function that I didn’t even know existed in Python (maybe I should go review them…), called all(). Basically this tests if each value in the list is true, and only then returns true. Exactly what I was looking for. So that’s how line 6 came to be.

Answer: 232792560

Problem Four

27 Aug

On to problem four.

Question: Find the largest palindrome made from the product of two 3-digit numbers.

Again, brute force was my solution to the problem. Essentially I worked my way down from 999-100, and multiplied that by the numbers from 999-100 (i.e. 999 * 999, 999 * 998, 999 * 997…100 * 999, 100 * 998, 100 * 997). There are some overlaps in this method, such as 999 * 500 and 500 * 999, but for the sake of simplicity I overlooked this inefficiency. Also, there’s no specific reason that I counted down from 999 except for the fact that I wanted to practice decrementing a loop. It would work just as well counting upwards from 100 to 999. Testing to see if it’s a palindrome is a simple matter of string manipulation. A larger palindrome will overwrite the previously stored palindrome.

Here’s the code:

# tuple to store the largest palindrome, and its two factors
palindrome = (0, 0, 0)

for i in range(999, 99, -1): # counts down just for the fun of it
	
	for j in range(999, 99, -1):

		product = str(i * j)
		print(product, product[::-1], i, "*", j) 
	
		if product == product[::-1]: # if palindrome
			
			if int(product) > palindrome[0]: # if current palindrome is bigger than the last
				palindrome = (int(product), i, j) # assign it and its two factors to the tuple 

print(palindrome)

There is some “excess” code in this that’s not directly used in determining the solution, such as storing the palindrome and its factors as a tuple (and printing each line), but I decided to keep it in there just to practice writing python.

Here’s the concise version (which runs significantly faster):

palindrome = 0

for i in range(999, 99, -1):	
	for j in range(999, 99, -1):

		product = str(i * j)

		if product == product[::-1]:
			if int(product) > palindrome:
				palindrome = int(product)

print(palindrome)

The enclosed for loops ensure that we test every possible three-digit by three-digit product. Since product is an integer, we convert it into a string and reverse it to test if it’s the same forward and backwards. If it is, we do another check to see if it’s larger than the previous largest palindrome (initialized to zero). When it’s done running, it prints the palindrome. The int()‘s make it possible to compare the two numbers as integers.

Let’s review the first, longer code now. Every time it tests two factors, it prints out a line in this format: “product reversed_product factor1 factor2”. You can delete this line to make the code many, many, many times faster (from around 2 minutes to 1.5 seconds). The palindrome is also stored as a tuple so it prints out the palindrome and its two factors at the end.

Mathematically, I don’t really know how to solve this. There’s a few solutions in the answers forum that are too long to paraphrase, but you can check there if you’re really interested.

Answer: 906609 (993 * 913)

Problem Three

25 Aug

Here’s problem three.

Question: What is the largest prime factor of the number 600851475143?

Ah, I must admit that I cheated a bit on this problem. My solution found the factors of that very large number, but not actually the prime factors. Simply by luck, the largest factor I found was a prime number, so it did work out in the end. However, a lot of people in the answers forum imported modules for primality testing, and one person even used a Mathematica function to list the prime factorization! So I guess to this end I’m cheating slightly less. 😛 (Correction to this statement at the bottom!)

So my solution to this problem was to test if a integer (incrementing every iteration) divided into the number cleanly. If true, I added that integer into a list called factors. I then divided the original number by that factor so that we could just put this code into a while loop.

Here’s the code: (d for divisor, facs for factors)

n = 600851475143
facs = [] # stores factors 
d = 2 # divisor must start at two

while d <= n:
	if n % d == 0: # if divisible, store factor in list and divide n by that factor
		facs.append(d)
		n /= d
		d = 2
	elif d == n: # last factor will be equal, so add to list and break
		facs.append(d)
		break
	else:
		d += 1 # if not divisible, increment divisor by one and test again

print(facs)

The number 600851475143 is stored in n, facs is an empty list, and d is set to 2 (division by zero if set to zero and infinite loop if set to one). Inside the while loop, it tests two things. If n can be divided by d (the current integer), it adds d to the list and sets n to the quotient of n and d. It also resets d (the divisor) to 2, since we are going to be testing a new number to divide in the next iteration. If it’s not divisible (and they are not equal), it increments d by one and repeats. When it reaches the last factor, d and n will be equal, so we add d the the list and break out of the loop. Finally we print the list.

So how would you solve this mathematically? I definitely would not like to, since the only way that I envision you could solve this is by brute force or some complex algorithm. I did google some mathematical algorithms initially to see if I could directly convert the math into python-speak, but they were almost all rooted in mathematical concepts beyond what I’ve learned, so I couldn’t really implement any of them. Some of these include the Pollard-Strassen method, the general number field sieve, and elliptic curve primality testing.

On the WolframMathworld website, I learned that this method is called direct search factorization, essentially brute forcing the solution by trial division. It states that this implementation is extremely inefficient and “simple-minded”. Ouch.

http://mathworld.wolfram.com/DirectSearchFactorization.html

Answer: 6857

Correction: I initially thought that my algorithm did not give prime factors, but on writing this post, I realized that it actually did. My solution was more elegant and complex that I had even percieved (ego boost)! Well, not really elegant (according to mr. wolfram), but it did do something I had not predicted. Guess my prescience could use a little work.

Thinking about it, the code tests for a factor by starting small and gradually getting larger. So by the time it reaches the large non-prime factors, it will have already found and eliminated the smaller primes (2, 3, 5, etc.) which are factors of these larger factors. My terminology is slightly arbitrary but I think you get the point. By the time it reaches a possible non-prime factor, it will have eliminated the smaller prime factors of that factor, so that particular number will not be counted as a factor. I think I might be just confusing you at this point, but Mathworld eloquently puts it as: “all possible factors (m) have had their cofactors already tested. …Therefore, m must be prime”.

Problem Two

25 Aug

Ok, let’s move on to problem two.

Question: By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.

Basically, we need to sum all the even fibonacci terms under four million. Let’s look holistically at how we would solve this problem. First, we need to have a function, or at least a procedure, to get the fibonacci terms. Then we need to do a test to check that it’s even, and if true, we add that term into some variable.

I did a quick google search to find some python code for the fibonacci sequence, and there were already a ton of solutions. However, it was pretty hard to intuitively understand (grok?) this code at first, even the simple ones, so I decided to try my hand at writing it myself. It took me about 30 minutes to come up with a clunky version of it, but I did pare it down by starting at one, instead of zero as I originally planned.

(Note: The projecteuler website states that the fibonacci sequence is “1, 2, 3, 5, 8, 13…”, but it technically is “0, 1, 1, 3, 5, 8…”. This does not affect our result though since we are only dealing with even numbers.)

From then on it was a quick manipulation to check the even-odd-ness (is there a word for that?) of the term, and then adding it into a sum variable. Here’s what the code looks like:

a = 0 
b = 1
total = 0

while b < 4000000:

	if b % 2 == 0:
		total += b

	a, b = b, b+a

print(total)

This code is pretty straightforward, but it might be a little hard to grasp at first. If you’re confused, think about it this way. Lines 1-3 initialize the variables. It’s important for a to equal zero and b to equal one. Lines 5 & 8 are essentially the code for the fibonacci sequence. I’ll explain line 8 in more detail later on, but now just take my word for it. Lines 6 & 7 check to see if the current term is even and adds it into total. Line 10 prints that variable.

So let’s examine line 8. This is the inner workings of the fibonacci sequence. Originally this code was three lines long, and looked like this:

c = b
b += a
a = c

Keep in mind that this is all line 8 (in the while loop), in its unshortened form. We are basically doing two things in the these lines: adding the previous term to the current term, and setting the current term to become the previous term in the next iteration. However, this requires a third variable c to work, to temporarily store the current value. I found a more efficient way of doing this in my earlier google searches, which looks like line 8 above.

Also duplicated here:

a, b = b, b+a

This essentially sets the two variables in one line, without the need of a third variable to temporarily store a value.

So that’s the code, but how would you solve it mathematically? Credit goes to RudyPenteado in the forum for coming up with this very clever solution. He realized that every third term of the fibonacci sequence is even. Additionally, phi, or φ, is the approximate ratio between consecutive terms in the fibonacci sequence. This fact can be manipulated so that the ratio between consequtive even terms is φ3, or 4.236068. The first even term is 2, so multiply that by 4.236068 and you get, after rounding to the nearest integer, 8. You repeat this to find the values up to four million (there’s less than ten terms), and add them with a calculator.

Answer: 4613732