- Looking for small projects
- Never applied for a job
- The rise and fall of binomial coefficients
- Need a 12-digit prime?
- Prove or disprove
- Extreme syntax
- Wooden cash register
- Synchronizing cicadas with Python
- Searching for Perrin pseudoprimes
- Looking in both directions
- NYT Book of Mathematics
- Efficiency vs. Robustness
- Asteroids can have moons
- Mutually odd functions
- Almost if and only if
- Ramanujan approximation for circumference of an ellipse
- Recognizing numbers
- More sides or more dice?
- Rolling dice for normal samples: Python version
- 2,000th post
- Hacking debt
- Bad normal approximation
- Why j for imaginary unit?
- Playful and purposeful, pure and applied
- Quotation and endorsement
- Moments of mixtures
- New Twitter accounts for DSP and music theory
- Social networks in fact and fiction
- Which Unicode characters can you depend on?
I'm looking for small consulting projects to fill the gaps between larger projects. I'm available for projects that would take up to a few days. I can't take on another large project right now. However, if your company takes several weeks to initiate a project, we could start the process now and I may be available by the time the paperwork is done. If you have a project you'd like to discuss, please let me know.
John Conway explained in an interview that he's never applied for an academic job.

I am rather proud of the fact that, in some sense, I never applied for an academic position in my life. What happened: I was walking down King's Parade, the main street in Cambridge, after I received my Ph.D. The chairman of the mathematics department said, "Oh, Conway, what have you done about applying for jobs?" And I said, "Nothing." "I thought that might be the answer," he said, "Well, we have a position in our department, and I think you should apply." And I said, "How do I apply?" And he said, "You write a letter to me." And I said, "What should I put in that letter?" Then he lost his patience, pulled out of his pocket a letter — which had been written on one side — turned it over, and scribbled "Dear Professor Cassels" — that was his name — "I wish to apply for dot-dot-dot." He handed it to me, and I signed it. It would have been nice if I'd gotten the job. I didn't that year, but I got the same job the next year with the same letter.
When you expand (_x_ + _y_)^_n_, the coefficients increase then decrease. The largest coefficient is in the middle if _n_ is even; it's the two in the middle if _n_ is odd. For example, the coefficients for (1 + _x_)^4 are 1, 4, 6, 4, 1 and the coefficients for (1 + _x_)^5 are 1, 5, 10, 10, 5, 1.

More generally, if _a_ > 0 and _b_ > 0, the coefficients of (_ax_ + _by_)^_n_ can only have one maximum. They may increase, or decrease, or change direction once, but they cannot wiggle more than that. They can't, for example, increase, decrease, then increase again.

Here's a proof. The _k_th coefficient is

C(_n_, _k_) _a_^_k_ _b_^(_n_-_k_) = Γ(_n_+1) _a_^_k_ _b_^(_n_-_k_) / (Γ(_k_+1) Γ(_n_-_k_+1)).

To show that the coefficients are unimodal as a function of _k_, we'll show that their logarithms are unimodal. And we'll do that by showing that they are samples from a concave function. The log of the _k_th coefficient is

log Γ(_n_+1) - log Γ(_k_+1) - log Γ(_n_-_k_+1) + _k_ log _a_ + (_n_-_k_) log _b_.

As a function of _k_, the terms log Γ(_n_+1) + _k_ log _a_ + (_n_-_k_) log _b_ form an affine function. The function log Γ is convex, so -log Γ is concave. The composition of a concave function with an affine function is concave, so -log Γ(_k_+1) and -log Γ(_n_-_k_+1) are concave functions of _k_. The sum of concave functions is concave. And the sum of a concave function with an affine function is concave. So binomial coefficients are log-concave and they can only have one maximum.

(The fact that log Γ(_z_) is convex is the easy direction of the Bohr-Mollerup theorem. The harder direction is that Γ(_z_) is the _only_ log-convex way to extend factorials to the positive reals.)
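If you'd like a quick sanity check of the log-concavity claim, here's a short SymPy sketch that verifies the condition (each coefficient squared is at least the product of its neighbors) for one arbitrary choice of _a_, _b_, and _n_:

```python
from sympy import binomial

# Coefficients of (a x + b y)^n for one arbitrary choice of a, b, n
a, b, n = 2, 3, 10
c = [binomial(n, k) * a**k * b**(n - k) for k in range(n + 1)]

# Log-concavity: c[k]^2 >= c[k-1] * c[k+1] for interior k,
# which implies the sequence has a single peak.
print(all(c[k]**2 >= c[k - 1] * c[k + 1] for k in range(1, n)))
```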
You may have seen the joke "Enter any 12-digit prime number to continue." I've seen it floating around as the punchline in several contexts. So what do you do if you need a 12-digit prime? Here's how to find the smallest one using Python.

```python
>>> from sympy import nextprime
>>> nextprime(10**11)
100000000003L
```

The function `nextprime` gives the smallest prime larger than its argument. (Note that the smallest 12-digit number is 10^11, not 10^12. Great opportunity for an off-by-one mistake.)

Optionally you can provide an additional argument to `nextprime` to get primes further down the list. For example, this gives the second prime larger than 10^11.

```python
>>> nextprime(10**11, 2)
100000000019L
```

What if you wanted the largest 12-digit prime rather than the smallest?

```python
>>> from sympy import prevprime
>>> prevprime(10**12)
999999999989L
```

Finally, suppose you want to know how many 12-digit primes there are. SymPy has a function `primepi` that returns the number of primes less than its argument. Unfortunately, it fails for large arguments. It works for arguments as big as 2^27 but throws a memory error for 2^28.

The number of primes less than _n_ is approximately _n_ / log _n_, so there are about 32 billion primes between 10^11 and 10^12. According to Wolfram Alpha, the exact number of 12-digit primes is 33,489,857,205. So if you try 12-digit numbers at random, your chances are about 1 in 30 of getting a prime. If you're clever enough to just pick odd numbers, your chances go up to 1 in 15.
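As a rough check of that 32 billion figure, the prime number theorem estimate π(_x_) ≈ _x_ / log _x_ takes two lines:

```python
from math import log

# pi(x) ~ x / log(x): estimate the number of primes between 10^11 and 10^12
estimate = 1e12/log(1e12) - 1e11/log(1e11)
print(estimate)  # roughly 3.2e10, i.e. about 32 billion
```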
From Concrete Mathematics:
Incidentally, when we're faced with a "prove or disprove," we're usually better off trying first to disprove with a counterexample, for two reasons: A disproof is potentially easier (we just need one counterexample); and nit-picking arouses our creative juices. Even if the given assertion is true, our search for a counterexample often leads us to a proof, as soon as we see why a counterexample is impossible. Besides, it's healthy to be skeptical.
In his book Let Over Lambda, Doug Hoyte says
Lisp is the result of taking syntax away, Perl is the result of taking syntax all the way.

Lisp practically has no syntax. It simply has parenthesized expressions. This makes it very easy to start using the language. And above all, it makes it easy to treat code as data. Lisp macros are very powerful, and these macros are made possible by the fact that the language is simple to parse.

Perl has complex syntax. Some people say it looks like line noise because of its liberal use of non-alphanumeric characters as operators. Perl is not easy to parse — there's a saying that only Perl can parse Perl — nor is it easy to start using. But the language was designed for regular users, not beginners, because you spend more time using a language than learning it.

There are reasons I no longer use Perl, but I don't object to the rich syntax. Saying Perl is hard to use because of its symbols is like saying Greek is hard to learn because it has a different alphabet. It takes years to master Greek, but you can learn the alphabet in a day. The alphabet is not the hard part.

Symbols can make text more expressive. If you've ever tried to read mathematics from the 18th or 19th century, you'll see what I mean. Before the 20th century, math publications were very verbose. It might take a paragraph to say what would now be said in a single equation. In part this is because notation has developed and standardized over time. Also, it is now much easier to typeset the symbols someone would use in handwriting. Perl's repertoire of symbols is parsimonious compared to mathematics.

I imagine that programming languages will gradually expand their range of symbols. People joke about how unreadable Perl code is, but I think a page of well-written Perl is easier to read than a page of well-written Lisp. At least the Perl is easier to scan: Lisp's typographical monotony makes it hard to skim for landmarks. One might argue that a page of Lisp can accomplish more than a page of Perl, and that may be true, but that's another topic.

***

Any discussion of symbols and programming languages must mention APL. This language introduced a large number of new symbols and never gained wide acceptance. I don't know that much about APL, but I'll give my impression of why I don't think APL's failure is proof that programmers won't use more symbols. APL required a special keyboard for input. That would no longer be necessary. APL also introduced a new programming model; the language would have been hard to adopt even without the special symbols. Finally, APL's symbols were completely unfamiliar and introduced all at once, unlike math notation that developed world-wide over centuries.

***

What if programming notation were more like music notation? Music notation is predominantly non-verbal, but people learn to read it fluently with a little training. And it expresses concurrency very easily. Or maybe programs could look more like choral music, a mixture of symbols and prose.
Suppose you want to know when your great-grandmother was born. You can't find the year recorded anywhere. But you did discover an undated letter from her father that mentions her birth and one curious detail: the 13-year and 17-year cicadas were swarming.

You do a little research and find that the 13-year cicadas are supposed to come out next year, and that the 17-year cicadas came out last year. When was your great-grandmother born?

Since 13 and 17 are relatively prime, the 13-year and 17-year cicadas will synchronize their schedules every 13 × 17 = 221 years. Suppose your great-grandmother was born _n_ years ago. The remainder when _n_ is divided by 13 must be 12, and the remainder when _n_ is divided by 17 must be 1. We have to solve the pair of congruences _n_ = 12 mod 13 and _n_ = 1 mod 17. The Chinese Remainder Theorem says that this pair of congruences has a solution, and the proof of the theorem suggests an algorithm for computing the solution.

The Python library SymPy has a function for solving linear congruences.

```python
>>> from sympy.ntheory.modular import solve_congruence
>>> solve_congruence( (12, 13), (1, 17) )
(103, 221)
```

This says we're 103 years into the joint cycle of the 13-year and 17-year cicadas. So your great-grandmother was born 103 years ago. (The solution 324 = 103 + 221 is also mathematically possible, but not biologically possible.)

You can use the same SymPy function to solve systems of more congruences. For example, when is the next year in which there will be summer Olympic games and the 13-year and 17-year cicadas will swarm? Here are a couple ways to approach this.

First, you could find the _last_ time this happened, then find when it will happen next. You'd need to solve _n_ = 1 mod 4 (since we had summer Olympics last year) and _n_ = 12 mod 13 and _n_ = 1 mod 17.

```python
>>> solve_congruence( (1, 4), (12, 13), (1, 17) )
(545, 884)
```

So the cicadas and the summer Olympics were in sync 545 years ago. (Well, they would have been if the modern Olympics had existed in the middle ages.) This says they'll be in sync again in 884 - 545 = 339 years.

Here's a more direct approach. We want to know when the summer Olympics will be 3 years ahead of where they are now in the cycle, when the 13-year cicadas will be 1 year ahead, and the 17-year cicadas will be 16 years ahead.

```python
>>> solve_congruence( (3, 4), (1, 13), (16, 17) )
(339, 884)
```

By the way, you can use negative integers with the congruences, so you could have used (-1, 17) to say the 17-year cicadas will be 1 year back instead of 16 years ahead in their cycle.
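If you'd rather not depend on SymPy, the construction suggested by the proof of the Chinese Remainder Theorem is easy to code directly for two congruences. Here's a minimal sketch of my own (it uses Python 3.8's modular inverse `pow(a, -1, m)`; SymPy's `solve_congruence` handles the general case):

```python
def crt2(a1, m1, a2, m2):
    # Solve n = a1 (mod m1) and n = a2 (mod m2) for coprime moduli m1, m2.
    # pow(m2, -1, m1) is the inverse of m2 mod m1 (requires Python 3.8+).
    n = a1 * m2 * pow(m2, -1, m1) + a2 * m1 * pow(m1, -1, m2)
    return n % (m1 * m2), m1 * m2

print(crt2(12, 13, 1, 17))  # (103, 221), matching solve_congruence
```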
A week ago I wrote about Perrin numbers, numbers _P__n_ defined by a recurrence relation similar to Fibonacci numbers. If _n_ is prime, _P__n_ mod _n_ = 0, and the converse is nearly always true. That is, if _P__n_ mod _n_ = 0, _n_ is USUALLY prime. The exceptions are called Perrin pseudoprimes. Matt McIrvin wrote an excellent post explaining how to compute Perrin pseudoprimes. Here I'm just going to elaborate on a couple points in his post.

Matt's first point is that if you want to search for Perrin pseudoprimes, the most direct approach won't get you very far. The obvious thing to do is compute _P__n_ and then see whether it has remainder 0 when you divide by _n_. The problem is that _P__n_ grows exponentially with _n_. In fact, _P__n_ is approximately ρ^_n_ where ρ = 1.3247… is the plastic number. This means that _P__n_ has about _n_ log10 ρ digits. So searching for pseudoprimes less than one billion would require working with numbers with over 100,000,000 digits. This can be done, but it's slow and unnecessary.

Since the goal is to compute _P__n_ mod _n_ rather than _P__n_ _per se_, we can carry out all calculations mod _n_ and avoid extended precision arithmetic as long as _n_ itself can fit in an ordinary precision integer. If we want to find pseudoprimes less than one billion, we calculate _P__n_ mod _n_ for each _n_ up to _N_ = 10^9. This only requires ordinary arithmetic. However, this approach takes O(_N_^2) time unless we're clever. We have to compute _P__n_ mod _n_ separately for each _n_, and the most direct approach takes _n_ steps.

This leads to Matt's second point: use matrix multiplication (mod _n_) to calculate _P__n_ mod _n_. This requires calculating the _n_th power of a 3×3 matrix, which can be done in O(log _n_) time using fast exponentiation. This makes the search for pseudoprimes less than _N_ require O(_N_ log _N_) rather than O(_N_^2) time. This is enough to make the search for pseudoprimes less than a billion practical.
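Here's a minimal sketch of that matrix approach (my own illustration, not Matt's code). It computes _P__n_ mod _n_ by raising the companion matrix of the recurrence to the _n_th power, reducing mod _n_ at every step:

```python
def mat_mult(A, B, m):
    # Multiply two 3x3 matrices, reducing entries mod m
    return [[sum(A[i][k] * B[k][j] for k in range(3)) % m
             for j in range(3)] for i in range(3)]

def mat_pow(A, e, m):
    # Fast exponentiation by repeated squaring: A**e mod m in O(log e) multiplications
    R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # identity
    while e:
        if e & 1:
            R = mat_mult(R, A, m)
        A = mat_mult(A, A, m)
        e >>= 1
    return R

def perrin_mod(n):
    # Companion matrix of P(k+3) = P(k+1) + P(k), with P0, P1, P2 = 3, 0, 2
    M = [[0, 1, 0], [0, 0, 1], [1, 1, 0]]
    Mn = mat_pow(M, n, n)
    return (3 * Mn[0][0] + 0 * Mn[0][1] + 2 * Mn[0][2]) % n

print(perrin_mod(271441))  # 0, even though 271441 = 521^2 is composite
```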
From David Mumford's May 2013 interview in SIAM News:
The applied mathematician has the difficult job of looking at a problem in context with no explicit mathematics and trying to see what kinds of mathematical ideas are under the surface that could clarify the situation. I think the most successful applied mathematicians are those who LOOK IN BOTH DIRECTIONS, at the science and the math. You can't become too attached to one way of looking at things. Applied math has always rejuvenated pure, and theorems in pure math can unexpectedly lead to new tools with vast applications.

Emphasis added. I wish Mumford had said "at the _problem_ and at the math" because not all applications of math are scientific.

RELATED POSTS:
- Hard analysis, soft analysis
- Impure math
Dogfooding refers to companies using their own software. According to Wikipedia,
In 1988, Microsoft manager Paul Maritz sent Brian Valentine, test manager for Microsoft LAN Manager, an email titled "Eating our own Dogfood", challenging him to increase internal usage of the company's product. From there, the usage of the term spread through the company.

Dogfooding is a great idea, but it's no substitute for usability testing. I get the impression that some products, if they're tested at all, are tested by developers intimately familiar with how they're intended to be used. If your company is developing consumer software, it's not dogfooding if just the developers use it. It's dogfooding when people in sales and accounting use it. But that's still no substitute for getting people outside the company to use it.

Dogfooding doesn't just apply to software development. Whenever I buy something with inscrutable assembly instructions, I wonder why the manufacturer didn't pay a couple people off the street to put the thing together on camera.
Today's newspaper may be interesting because it reports new information. Newspapers from decades ago may be interesting for different reasons, not for the explicit content but for the implicit content. What were the contemporary reactions to what's now well known? What were the readers expected to know and not know? To be impressed by? But the newspapers in between are not so interesting. They're not news and they're not history. The New York Times has just published their Book of Mathematics, a collection of over 100 math articles written from 1892 to 2010. Most of the articles are toward the 2010 end of the timeline. I found most of the articles to be old news but not old enough to be historically interesting. However, I'm a professional mathematician, and these articles were written for a popular audience. The intended audience for the original articles, as well as the new compilation, would probably enjoy the book more. On the other hand, these are newspaper articles: lots of text, no color, and few illustrations. People who read popular math books might have less patience for this book. One of the articles I did enjoy was "The Electronic Digital Computer: How It Started, How It Works and What It Does" from 1967. It's the longest article in the book at 20 pages, and goes into some depth. Of course parts of it are also quaint. I also enjoyed "A Soviet Discovery Rocks World of Mathematics" from 1979. It's jarring now to hear the adjective "Soviet" applied to a mathematical result, but I assume this was not remarkable at the time. The article is about Khachiyan's discovery of the first polynomial time algorithm for solving linear programming problems. The algorithm was impractical but groundbreaking. It quickly led to new algorithms that were efficient in theory and in practice. The article has a hint of panic between the lines, something like the reaction to Sputnik but to a lesser degree. *** Andrew Gelman wrote a short review of this book on his blog a few days ago. I put off reading his review until I had written my own above. His reaction was similar to mine:
… Fun for the math content and historical/nostalgia value. … I have too much of a technical bent to be the ideal reader for this sort of book, but it seems like an excellent gift for a non-technical reader who nonetheless enjoys math. … My own preference would have been … more old stuff.
Something is EFFICIENT if it performs optimally under ideal circumstances. Something is ROBUST if it performs pretty well under less than ideal circumstances. Life is full of trade-offs between efficiency and robustness. Over time, I've come to place more weight on the robustness side of the trade-off. I imagine that's typical. With more experience, you become more willing to concede that you haven't thought of everything that could happen. After a while, you notice that events with a "one in a million" chance of occurring happen much more often than predicted. ROBUST THINGS ARE OFTEN MORE EFFICIENT THAN EFFICIENT THINGS ARE ROBUST. That is, robust strategies often do fairly well under ideal circumstances. You may not give up much efficiency for robustness. But efficient strategies can fail spectacularly when the assumptions they're based on don't hold.
This afternoon my postman delivered a review copy of The Space Book by Jim Bell. This is the latest book in a series that includes Cliff Pickover's math, physics, and medical books. Like the other books in the series, The Space Book alternates one-page articles and full-page color images. Here's something I learned while skimming through the book: Asteroids can have moons. (That's the title of the article on page 414.) This has been known since the early 1990s, but it's news to me. The first example discovered was a satellite now named Dactyl orbiting the asteroid 243 Ida. The Space Book says Dactyl was discovered in 1992. Wikipedia says Dactyl was photographed by the Galileo spacecraft in 1993 and discovered by examining the photos in February of 1994. Since that time, "more than 220 minor planet moons have been found."
The floor of a real number _x_ is the largest integer _n_ ≤ _x_, written ⌊_x_⌋. The ceiling of a real number _x_ is the smallest integer _n_ ≥ _x_, written ⌈_x_⌉. The floor and ceiling have the following symmetric relationship:

⌊-_x_⌋ = -⌈_x_⌉
⌈-_x_⌉ = -⌊_x_⌋

The floor and ceiling functions are not odd, but as a pair they satisfy a generalized parity condition:

_f_(-_x_) = -_g_(_x_)
_g_(-_x_) = -_f_(_x_)

If the functions _f_ and _g_ are equal, then each is an odd function. But in general _f_ and _g_ could be different, as with floor and ceiling.

Is there an established name for this sort of relation? I thought of "mutually odd" because it reminds me of mutual recursion. Can you think of other examples of mutually odd functions?

RELATED POSTS:
- Saved by symmetry
- Odd numbers in odd bases
- The power of parity
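And just to make the floor/ceiling relation above concrete, here's a quick numerical check using Python's math module (an illustration only):

```python
from math import floor, ceil
from random import uniform

# Check floor(-x) == -ceil(x) and ceil(-x) == -floor(x) on random samples
xs = [uniform(-100, 100) for _ in range(1000)]
print(all(floor(-x) == -ceil(x) and ceil(-x) == -floor(x) for x in xs))
```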
The Perrin numbers have a definition analogous to Fibonacci numbers. Define _P_0 = 3, _P_1 = 0, and _P_2 = 2. Then for _n_ ≥ 0, define _P_(_n_+3) = _P_(_n_+1) + _P__n_. The Concrete Tetrahedron says
It appears that _n_ is prime "almost if and only if" _P__n_ mod _n_ = 0.

The "only if" condition is true without qualification: if _n_ is prime, _P__n_ mod _n_ = 0. It's the "if" part that's almost true. When _P__n_ mod _n_ = 0, _n_ is USUALLY prime. Composite numbers that satisfy the Perrin condition _P__n_ mod _n_ = 0 are called Perrin pseudoprimes.

The smallest Perrin pseudoprime is 271,441. The next is 904,631. There are only 17 Perrin pseudoprimes less than a billion. By comparison, there are 50,847,534 primes less than a billion. So if you used the Perrin condition to test whether numbers less than a billion are prime, you would correctly identify all 50,847,534 primes as primes. But out of the 949,152,466 composite numbers, you would falsely report 17 of these as prime. In other words, you would be 100% accurate in identifying primes as primes, but only 99.999998% accurate in identifying composite numbers as composite.

RELATED POSTS:
- Oscillating Fibonacci ratios
- Probability that a number is prime
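A direct way to check the Perrin condition for one particular _n_ is to run the recurrence mod _n_. Here's a minimal sketch of my own; it's far too slow for searching up to a billion (see the Perrin pseudoprime post above for a faster approach), but fine for a single number like 271,441:

```python
from sympy import isprime

def perrin_mod(n):
    # Iterate P(k+3) = P(k+1) + P(k) with P0, P1, P2 = 3, 0, 2, reducing mod n at each step
    a, b, c = 3, 0, 2
    for _ in range(n):
        a, b, c = b, c, (a + b) % n
    return a  # P_n mod n

n = 271441
print(perrin_mod(n) == 0, isprime(n))  # True False: a Perrin pseudoprime
```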
There's no elementary formula for the circumference of an ellipse, but there is an elementary approximation that is extremely accurate.

An ellipse has equation (_x_/_a_)² + (_y_/_b_)² = 1. If _a_ = _b_, the ellipse reduces to a circle and the circumference is simply 2π_a_. But the general formula for circumference requires the hypergeometric function ₂_F_₁:

C = π(_a_ + _b_) ₂_F_₁(-1/2, -1/2; 1; λ²)

where λ = (_a_ - _b_)/(_a_ + _b_). However, if the ellipse is anywhere near circular, the following approximation due to Ramanujan is extremely good:

C ≈ π(_a_ + _b_) (1 + 3λ²/(10 + √(4 - 3λ²)))

To quantify what we mean by extremely good, the error is O(λ^10). When an ellipse is roughly circular, λ is fairly small, and the error is on the order of λ to the 10th power.

To illustrate the accuracy of the approximation, I tried the formula out on some planets. The error increases with ellipticity, so I took the most elliptical orbit of a planet or object formerly known as a planet. That distinction belongs to Pluto, in which case λ = 0.016. If Pluto's orbit were exactly elliptical, you could use Ramanujan's approximation to find the circumference of its orbit with an error less than one micrometer.

Next I tried it on something with a much more elliptical orbit: Halley's comet. Its orbit is nearly four times longer than it is wide. For Halley's comet, λ = 0.59 and Ramanujan's approximation agrees with the exact result to seven significant figures. The exact result is 11,464,319,022 km and the approximation is 11,464,316,437 km. Here's a video showing how elliptical the comet's orbit is.

If you'd like to experiment with the approximation, here's some Python code:

```python
from scipy import pi, sqrt
from scipy.special import hyp2f1

def exact(a, b):
    t = ((a-b)/(a+b))**2
    return pi*(a+b)*hyp2f1(-0.5, -0.5, 1, t)

def approx(a, b):
    t = ((a-b)/(a+b))**2
    return pi*(a+b)*(1 + 3*t/(10 + sqrt(4 - 3*t)))

# Semimajor and semiminor axes for Halley's comet orbit
a = 2.667950e9 # km
b = 6.782819e8 # km

print exact(a, b)
print approx(a, b)
```
I was playing around with SymPy, a symbolic math package for Python, and ran across `nsimplify`. It takes a floating point number and tries to simplify it: as a fraction with a small denominator, square root of a small integer, an expression involving famous constants, etc.

For example, suppose some calculation returned 4.242640687119286 and you suspect there's something special about that number. Here's how you might test where it came from.

```python
>>> from sympy import *
>>> nsimplify(4.242640687119286)
3*sqrt(2)
```

Maybe you do a calculation numerically, find a simple expression for the result, and that suggests an analytical solution.

I think a more common application of `nsimplify` might be to help you remember half-forgotten formulas. For example, maybe you're rusty on your trig identities, but you remember that cos(π/6) is something special.

```python
>>> nsimplify(cos(pi/6))
sqrt(3)/2
```

Or to take a more advanced example, suppose that you vaguely remember that the gamma function takes on recognizable values at half integer values, but you don't quite remember how. Maybe something involving π or _e_. You can suggest that `nsimplify` include expressions with π and _e_ in its search.

```python
>>> nsimplify(gamma(3.5), constants=[pi, E])
15*sqrt(pi)/8
```

You can also give `nsimplify` a tolerance, asking it to find a simple representation within a neighborhood of the number. For example, here's a way to find approximations to π.

```python
>>> nsimplify(pi, tolerance=1e-5)
355/113
```

With a wider tolerance, it will return a simpler approximation.

```python
>>> nsimplify(pi, tolerance=1e-2)
22/7
```

Finally, here's a higher precision approximation to π that isn't exactly simple:

```python
>>> nsimplify(pi, tolerance=1e-7)
exp(141/895 + sqrt(780631)/895)
```
My previous post looked at rolling 5 six-sided dice as an approximation of a normal distribution. If you wanted a better approximation, you could roll dice with more sides, or you could roll more dice. Which helps more? Whether you double the number of sides per die or double the number of dice, you have the same total number of spots possible. But which approach helps more? Here's a plot. We start with 5 six-sided dice and either double the number of sides per die (the blue dots) or double the number of dice (the green triangles). When the number of sides _n_ gets big, it's easier to think of a spinner with _n_ equally likely stopping points than an _n_-sided die. At first, increasing the number of sides per die reduces the maximum error more than the same increase in the number of dice. But after doubling six times, i.e. increasing by a factor of 64, both approaches have the same error. But further increasing the number of sides per die makes little difference, while continuing to increase the number of dice decreases the error. The long-term advantage goes to increasing the number of dice. By the central limit theorem, the error will approach zero as the number of dice increases. But with a fixed number of dice, increasing the number of sides only makes each die a better approximation to a uniform distribution. In the limit, your sum approximates a normal distribution no better or worse than the sum of five uniform distributions. But in the near term, increasing the number of sides helps more than adding more dice. The central limit theorem may guide you to the right answer eventually, but it might mislead you at first.
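If you'd like to reproduce the comparison yourself, here's a short sketch of my own using the NumPy `polypow` approach from the related post below. It computes the maximum error between the CDF of the dice sum and the matching normal CDF for a given number of sides and dice; the exact numbers behind the plot aren't given here, so treat this as a way to generate your own.

```python
import numpy as np
from numpy.polynomial.polynomial import polypow
from scipy.stats import norm

def max_cdf_error(sides, dice):
    # PMF of the sum of `dice` fair dice with `sides` sides, via polynomial powers
    p = np.ones(sides + 1) / sides
    p[0] = 0
    pmf = polypow(p, dice)
    cdf = pmf.cumsum()
    # Normal CDF with matching mean and variance, evaluated at the integer sums
    mean = 0.5 * (sides + 1) * dice
    sd = np.sqrt(dice * (sides**2 - 1) / 12.0)
    norm_cdf = norm.cdf(np.arange(len(cdf)), mean, sd)
    return np.abs(cdf - norm_cdf).max()

# Double the sides vs. double the dice, starting from 5 six-sided dice
print(max_cdf_error(12, 5), max_cdf_error(6, 10))
```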
A handful of dice can make a decent normal random number generator, good enough for classroom demonstrations. I wrote about this a while ago. My original post included Mathematica code for calculating how close to normal the distribution of the sum of the dice is. Here I'd like to redo the code in Python to show how to do the same calculations using SymPy. [UPDATE: I'll also give a solution that does not use SymPy and that scales much better.]

If you roll five dice and add up the spots, the probability of getting a sum of _k_ is the coefficient of _x_^_k_ in the expansion of (_x_ + _x_^2 + _x_^3 + _x_^4 + _x_^5 + _x_^6)^5 / 6^5.

Here's code to find the probabilities by expanding the polynomial and taking coefficients.

```python
from sympy import Symbol

sides = 6
dice = 5
rolls = range( dice*sides + 1 )

# Tell SymPy that we want to use x as a symbol, not a number
x = Symbol('x')

# p(x) = (x + x^2 + ... + x^m)^n
# where m = number of sides per die
# and n = number of dice
p = sum([x**i for i in range(1, sides + 1)])**dice

# Extract the coefficients of p(x) and divide by sides**dice
pmf = [sides**(-dice) * p.expand().coeff(x, i) for i in rolls]
```

If you'd like to compare the CDF of the dice sum to a normal CDF you could add this.

```python
from scipy import array, sqrt
from scipy.stats import norm

cdf = array(pmf).cumsum()

# Normal CDF for comparison
mean = 0.5*(sides + 1)*dice
variance = dice*(sides**2 - 1)/12.0
temp = [norm.cdf(i, mean, sqrt(variance)) for i in rolls]
norm_cdf = array(temp)

diff = abs(cdf - norm_cdf)
# Print the maximum error and where it occurs
print diff.max(), diff.argmax()
```

QUESTION: Now suppose you want a better approximation to a normal distribution. Would it be better to increase the number of dice or the number of sides per die? For example, would you be better off with 10 six-sided dice or 5 twelve-sided dice? Think about it before reading the solution.

UPDATE: The SymPy code does not scale well. When I tried the code with 50 six-sided dice, it ran out of memory. Based on Andre's comment, I rewrote the code using `polypow`. SymPy offers much more symbolic calculation functionality than NumPy, but in this case NumPy contains all we need. It is much faster and it doesn't run out of memory.

```python
from numpy.polynomial.polynomial import polypow
from numpy import ones

sides = 6
dice = 100

# Create an array of polynomial coefficients for
# x + x^2 + ... + x^sides
p = ones(sides + 1)
p[0] = 0

# Extract the coefficients of p(x)**dice and divide by sides**dice
pmf = sides**(-dice) * polypow(p, dice)
cdf = pmf.cumsum()
```

That solution works for up to 398 dice. What's up with that? With 399 dice, the largest polynomial coefficient overflows. If we divide by the number of sides _before_ raising the polynomial to the power `dice`, the code becomes a little simpler and scales further.

```python
p = ones(sides + 1)
p[0] = 0
p /= sides
pmf = polypow(p, dice)
cdf = pmf.cumsum()
```

I tried this last approach on 10,000 dice with no problem.
This is my 2,000th blog post. I've been blogging for a little over five years, writing around a post a day. Thank you all for reading, commenting, sharing, and generally being so encouraging. This post will just be a few brief updates. BLOG UPGRADE I upgraded my blogging software and changed the theme a few weeks ago. The new theme is supposed to be more mobile-friendly. There were a few little problems with the upgrade. I think I've fixed everything by now, though I'd still like to make a few changes here and there. If you see any problems or have any suggestions, please let me know. CONSULTING Going out on my own has been a blast. I've got a few projects going on now and more on the horizon. Really interesting work. As you'd expect if you've read this blog for a while, the work has been a mixture of math, stats, computing, and writing. I may be hiring a couple people part-time to help me out. GOOGLE READER If you've subscribed to this blog through Google Reader, remember that Google is going to turn it off July 1. There are many other RSS readers out there. I list a few options here. My impression is that a lot of the people are moving to Feedly or NewsBlur. TRAVEL I plan to visit Tuscaloosa, San Francisco, and Austin in the next few weeks. Let me know if you're in one of these areas and would like to get together.
The term _technical debt_ describes the accumulated effect of short term decisions in a software development process. In order to meet a deadline, for example, a project will take shortcuts, developing code in a way that's not best for future maintainability but that saves time immediately. Once the pressure is off, hopefully, the team goes back and repays the technical debt by refactoring.

I'd like to propose _hacking debt_ to describe a person who has been focused on "real work" for so long that he or she hasn't spent enough time playing around, making useless stuff for the fun of it. Some portion of a career should be devoted to hacking. Not 100%, but not 0% either. Without some time spent exploring and having fun, people become less effective and eventually burn out.

RELATED POSTS:
- Just-in-case versus just-in-time
- Bicycle skills
- Playful and purposeful, pure and applied

For hacking financial debt, see this.
Sometimes you can approximate a binomial distribution with a normal distribution. Under the right conditions, a Binomial(_n_, _p_) has approximately the distribution of a normal with the same mean and variance, i.e. mean _np_ and variance _np_(1-_p_). The approximation works best when _n_ is large and _p_ is near 1/2. This afternoon I was reading a paper that used a normal approximation to a binomial when _n_ was around 10 and _p_ around 0.001. The relative error was enormous. The paper used the approximation to find an analytical expression for something else and the error propagated. A COMMON RULE OF THUMB is that the normal approximation works well when _np_ > 5 and _n_(1-_p_) > 5. This says that the closer _p_ is to 0 or 1, the larger _n_ needs to be. In this case _p_ was very small, but _n_ was not large enough to compensate since _np_ was on the order of 0.01, far less than 5. ANOTHER RULE OF THUMB is that normal approximations in general hold well near the center of the distribution but not in the tails. In particular the _relative_ error in the tails can be unbounded. This paper was looking out toward the tails, and relative error mattered. For more details, see these notes on the normal approximation to the binomial.
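To see how bad the approximation can be in this regime, here's a quick comparison. The paper's exact parameters aren't given above, so _n_ = 10 and _p_ = 0.001 are merely illustrative values:

```python
from math import sqrt
from scipy.stats import binom, norm

n, p = 10, 0.001          # illustrative values, similar to those in the paper
mean, sd = n*p, sqrt(n*p*(1 - p))

for k in [1, 2, 3]:
    exact = binom.sf(k - 1, n, p)   # P(X >= k) under the binomial
    approx = norm.sf(k, mean, sd)   # P(N >= k) under the normal approximation
    print(k, exact, approx)
```

Even at k = 1 the two tail probabilities differ by many orders of magnitude, so any expression built on the normal tail inherits an enormous relative error.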
Electrical engineers use _j_ for the square root of -1 while nearly everyone else uses _i_. The usual explanation is that EE's do this because they use _i_ for current. But here's one advantage to using _j_ that has nothing to do with electrical engineering.

The symbols **i**, **j**, and **k** are used for unit vectors in the directions of the _x_, _y_, and _z_ axes respectively. That means that "i" has two different meanings in the real plane, depending on whether you think of it as the vector space spanned by **i** and **j** or as complex numbers. But if you use _j_ to represent the imaginary unit, its meaning does not change. Either way it points along the _y_ axis.

Said another way, bold face **i** and italic _i_ point in different directions. But bold face **j** and italic _j_ both point in the same direction.

Here's what moving from vectors to complex numbers looks like in math notation:

_x_ **i** + _y_ **j** → _x_ + _iy_

And here's what it looks like in electrical engineering notation:

_x_ **i** + _y_ **j** → _x_ + _jy_

I don't expect math notation to change, nor would I want it to. I'm happy with _i_. But using _j_ might make moving between vectors and complex numbers a little easier.

***

By the way, I use _j_ on @DSP_fact. DSP came out of electrical engineering, so _j_ is conventional there. I also use _j_ in Python because that's what the language requires.
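For example, Python writes the imaginary unit with a `j` suffix:

```python
z = 3 + 4j
print(z * 1j)   # 1j is the imaginary unit; prints (-4+3j)
print(abs(z))   # 5.0
```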
From Edwin Land, inventor of the Polaroid camera:
… applied science, purposeful and determined, and pure science, playful and freely curious, continuously support and stimulate each other. The great nation of the future will be the one which protects the freedom of pure science as much as it encourages applied science.
I like sharing quotes on Twitter. Occasionally a quote will provoke an angry reaction, not to the content of the quote but to the source. Sometimes people will even acknowledge that they agree with the quote, but are dismayed that I would quote such a despicable person. This morning I was reading Norman Geisler's book on Thomas Aquinas and these lines reminded me of the brouhaha over quotes and sources.
No, I do not agree with everything he [Aquinas] ever wrote. On the other hand, neither do I agree with everything _I_ ever wrote.

I'd say along with Geisler that if I could only quote people I completely agreed with, I could not even quote myself. Geisler goes on to say
But seven hundred years from now no one will even recognize my name, while Aquinas' works will still be used with great profit.

I feel the same way about many of the people I quote. I remember catching flak for quoting Martin Luther. I've already forgotten the critic's name, and he's probably forgotten mine, but people will still be reading Luther in another five hundred years.
I needed to compute the higher moments of a mixture distribution for a project I'm working on. I'm writing up the code here in case anyone else finds this useful. (And in case I'll find it useful in the future.) I'll include the central moments first. From there it's easy to compute skewness and kurtosis.

Suppose _X_ is a mixture of _n_ random variables _X_i with weights _w_i, non-negative numbers adding to 1. Then the _j_th central moment of _X_ is given by

E[(_X_ - μ)^_j_] = Σ_i _w_i Σ_{_k_=0..j} C(_j_, _k_) (μ_i - μ)^(_j_-_k_) E[(_X_i - μ_i)^_k_]

where μ_i is the mean of _X_i and μ = Σ_i _w_i μ_i is the mean of the mixture.

In my particular application, I'm interested in a mixture of normals and so the code below computes the moments for a mixture of normals. It could easily be modified for other distributions.

```python
from scipy.misc import factorialk, comb

def mixture_central_moment(mixture, moment):
    """Compute the higher moments of a mixture of normal rvs.
    mixture is a list of (mu, sigma, weight) triples.
    moment is the central moment to compute."""

    mix_mean = sum( [w*m for (m, s, w) in mixture] )

    mixture_moment = 0.0
    for triple in mixture:
        mu, sigma, weight = triple
        for k in range(moment+1):
            prod = comb(moment, k) * (mu-mix_mean)**(moment-k)
            prod *= weight*normal_central_moment(sigma, k)
            mixture_moment += prod

    return mixture_moment

def normal_central_moment(sigma, moment):
    """Central moments of a normal distribution"""
    if moment % 2 == 1:
        return 0.0
    else:
        # If Z is a std normal and n is even, E(Z^n) == (n-1)!!
        # So E (sigma Z)^n = sigma^n (n-1)!!
        return sigma**moment * factorialk(moment-1, 2)
```

Once we have code for central moments, it's simple to add code for computing skewness and kurtosis.

```python
def mixture_skew(mixture):
    variance = mixture_central_moment(mixture, 2)
    third = mixture_central_moment(mixture, 3)
    return third / variance**(1.5)

def mixture_kurtosis(mixture):
    variance = mixture_central_moment(mixture, 2)
    fourth = mixture_central_moment(mixture, 4)
    return fourth / variance**2 - 3.0
```

Here's an example of how the code might be used.

```python
# Test on a mixture of 30% Normal(-2, 1) and 70% Normal(1, 3)
mixture = [(-2, 1, 0.3), (1, 3, 0.7)]

print "Skewness = ", mixture_skew(mixture)
print "Kurtosis = ", mixture_kurtosis(mixture)
```

RELATED POST: General formula for normal moments
I've started two new Twitter accounts this week: @DSP_fact and @MusicTheoryTip. DSP_fact is for DSP, digital signal processing: filters, Fourier analysis, convolution, sampling, wavelets, etc. MusicTheoryTip is for basic music theory with a little bias toward jazz. It'll tweet about harmony, scales, tuning, notation, etc. Here's a full list of my 15 daily tip twitter accounts. If you're interested in one of these accounts but don't use Twitter, you can subscribe to a Twitter account via RSS just as you'd subscribe to a blog. If you're using Google Reader to subscribe to RSS feeds, you'll need to switch to something else by July 1. Here are 18 alternatives.
SIAM News arrived this afternoon and had an interesting story on the front page: Applying math to myth helps separate fact from fiction. In a nutshell, the authors hope to get some insight into whether a myth is based on fact by seeing whether the social network of characters in the myth looks more like a real social network or like the social network in a work of deliberate fiction. For instance, the social networks of the Iliad and Beowulf look more like actual social networks than does the social network of Harry Potter. Real social networks follow a power law distribution more closely than do social networks in works of fiction. This could be interesting. For example, the article points out that some scholars believe Beowulf has a basis in historical events, though they don't believe that Beowulf the character corresponds to a historical person. The network approach lends support to this position: the Beowulf social network looks more realistic when Beowulf himself is removed. It seems however that an accurate historical account might have a suspicious social network, not because the events in it were made up but because they were filtered according to what the historian thought was important.
Unicode is supported everywhere, but font support for Unicode characters is sparse. When you use any slightly uncommon character, you have no guarantee someone else will be able to see it.

I'm starting a Twitter account @MusicTheoryTip and so I wanted to know whether I could count on followers seeing music symbols. I asked whether people could see ♭ (flat, U+266D), ♮ (natural, U+266E), and ♯ (sharp, U+266F). Most people could see all three symbols, from desktop or phone, browser or Twitter app. However, several were unable to see the natural sign from an Android phone, whether using a browser or a Twitter app. One person said none of the symbols show up on his Blackberry.

I also asked @diff_eq followers whether they could see the math symbols ∂ (partial, U+2202), Δ (Delta, U+0394), and ∇ (gradient, U+2207). One person said he couldn't see the gradient symbol, but the rest of the feedback was positive.

So what characters can you count on nearly everyone being able to see? To answer this question, I looked at the characters in the intersection of several common fonts: Verdana, Georgia, Times New Roman, Arial, Courier New, and Droid Sans. My thought was that this would make a very conservative set of characters.

There are 585 characters supported by all the fonts listed above. Most of the characters with code points up to U+01FF are included. This range includes the code blocks for Basic Latin, Latin-1 Supplement, Latin Extended-A, and some of Latin Extended-B. The rest of the characters in the intersection are Greek and Cyrillic letters and a few scattered symbols. Flat, natural, sharp, and gradient didn't make the cut.

There are a dozen math symbols included:

0x2202 ∂
0x2206 ∆
0x220F ∏
0x2211 ∑
0x2212 −
0x221A √
0x221E ∞
0x222B ∫
0x2248 ≈
0x2260 ≠
0x2264 ≤
0x2265 ≥

Interestingly, even in such a conservative set of characters, there are three characters included for semantic distinction: the minus sign (i.e. not a hyphen), the difference operator (i.e. not the Greek letter Delta), and the summation operator (i.e. not the Greek letter Sigma).

And in case you're interested, here's the complete list of the Unicode characters in the intersection of the fonts listed here. (UPDATE: Added notes to indicate the start of a new code block and listed some of the isolated characters.)

0x0009 Basic Latin
0x000d
0x0020 - 0x007e
0x00a0 - 0x017f Latin-1 supplement
0x0192
0x01fa - 0x01ff
0x0218 - 0x0219
0x02c6 - 0x02c7
0x02c9
0x02d8 - 0x02dd
0x0300 - 0x0301
0x0384 - 0x038a Greek and Coptic
0x038c
0x038e - 0x03a1
0x03a3 - 0x03ce
0x0401 - 0x040c
0x040e - 0x044f Cyrillic
0x0451 - 0x045c
0x045e - 0x045f
0x0490 - 0x0491
0x1e80 - 0x1e85 Latin extended additional
0x1ef2 - 0x1ef3
0x200c - 0x200f General punctuation
0x2013 - 0x2015
0x2017 - 0x201e
0x2020 - 0x2022
0x2026
0x2028 - 0x202e
0x2030
0x2032 - 0x2033
0x2039 - 0x203a
0x203c
0x2044
0x206a - 0x206f
0x207f
0x20a3 - 0x20a4 Currency symbols ₣ ₤
0x20a7 ₧
0x20ac €
0x2105 Letterlike symbols ℅
0x2116 №
0x2122 ™
0x2126 Ω
0x212e ℮
0x215b - 0x215e ⅛ ⅜ ⅝ ⅞
0x2202 Mathematical operators ∂
0x2206 ∆
0x220f ∏
0x2211 - 0x2212 ∑ −
0x221a √
0x221e ∞
0x222b ∫
0x2248 ≈
0x2260 ≠
0x2264 - 0x2265 ≤ ≥
0x25ca Box drawing ◊
0xfb01 - 0xfb02 Alphabetic presentation forms ﬁ ﬂ
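If you want to see how the symbols discussed above render in your own environment, you can print them by code point. A quick sketch:

```python
# Music and math symbols mentioned above, by Unicode code point
for cp in [0x266D, 0x266E, 0x266F, 0x2202, 0x0394, 0x2207]:
    print(hex(cp), chr(cp))
```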