Big Computing: March 2012

Friday, March 30, 2012

1940 US Census Raw Data to be Released on April 2nd - Analytics Geeks Rejoice!

Yes, there is nothing better for an analytics junkie than getting access to a big data set whose raw data is available in digital format. So it is with the 1940 United States Federal Census data, to be released this coming Monday. This data is not anonymized and contains the name, age, sex and location of all 130 million Americans. In addition, 5% of respondents were asked supplemental questions, and those answers are also contained in this data. This should be a fun open data set to play with.

Upon its release, the 1940 U.S. Census Community Project, a joint initiative between Archives.com, FamilySearch, findmypast.com, and other leading genealogy organizations, will coordinate efforts to provide quick access to these digital images and immediately start indexing these records to make them searchable online with free and open access. They are looking for volunteers to help in this effort.

So hack away and enjoy this new data set.

Where is the Sustainable Sushi in California?

I went out to California to visit my parents over spring break. On that trip we went to the popular local sushi restaurant, a place called Okura. Apparently this is where all the professional tennis players go to eat during the BNP Open in La Quinta.

I figured that a sushi place in Southern California would be progressive and cutting edge. I was wrong. There was no evidence of any effort to be environmentally conscious. The chopsticks were disposable and the seafood was unsustainable. Being a regular at Miya's in New Haven, I had not seen this type of seafood in years. I found I was not missing anything. The flavors were bland, and the food lacked creativity. Okura served the old standards: rolls with tuna, salmon and shrimp. I found these could not stand up against Miya's Lion Fish, Scup and Asian Shore Crabs.

With all that has happened lately in our understanding of what we can and should do to eat seafood in a sustainable way, it is time for everyone to follow the direction set by Miya's Sushi. Sustainable seafood is now mainstream, with grocery stores like Wegmans and Whole Foods leading the way.

Thursday, March 29, 2012

Murray Lender passes away... thanks for everything.

Last week Murray Lender of Lender's Bagels fame passed away. I had not thought of him for years. I only met him a few times, but he and his family did some little and big things that really made growing up in New Haven a great experience.

I went to school with a number of the Lender family kids. Every week a truck from the Lenders' bakery would drop off fresh bagels to be given out for free at snack time. They were great. To this day, when I think of getting a snack, the first thing I think of is a bagel, because not only do I like bagels but they bring me back to the pleasant memories of my youth.

In the 80s the Lenders opened a restaurant in Hamden where every dish was built around bagels. There were bagel pizzas and bagel burgers. That was the first time I ever saw a bagel pizza, and I still see them in the freezer section whenever I go to the store. The bagel burgers are still my favorite burgers of all time. There is simply nothing better than an onion bagel burger. It is awesome!

After they sold Lender's Bagels to Kraft, the Lender family expanded their support of the New Haven community and the schools their kids went to. I enjoyed the results of their generosity when I was young, and my daughter is using those same facilities today.

I would say Murray Lender will be missed, but I see him every day in New Haven at the JCC, Hamden Hall and Foote.

Tuesday, March 13, 2012

March 14 is International Pi Day

Yes, it is that time of year when all practicing Geeks gather around the round table to celebrate that most sacred of days to honor the mighty Pi. Not the Apple Pie of All-American fame or the Cherry Pie we all craved in our teenage years, but Pi, the ratio of a circle's circumference to its diameter, or roughly 3.14.

Although I have always been partial to the worship of Avogadro's Number as a better representation of the dimensionless representation of matter, it is the humble Pi that has attracted the most ardent followers in recent years. Just like most religions, numerists have placed their holidays on top of or near popular pagan holidays of the past. Passover and Easter come up yearly around the time of the ancient spring festival. So it is that Pi Day comes to us every spring to mark the exit from mathematical ignorance that started with the Greeks. Originally Pi was known as Archimedes' Constant, but since you can traditionally only have one thing named after you and there was the Archimedes Screw (I love that name!), Pi became known as Pi.

Pi Day is celebrated just like any other religious holiday. We eat traditional foods (round).

We watch movies about the thing we worship.

Not exactly Santa Claus is Coming to Town, but not bad. There is also a website for Pi Day and a Facebook page. So kick back, eat a pizza at a round table and throw a ball around this Pi Day. Thanks to Jared Lander for introducing me to this most special of days.

Monday, March 12, 2012

Predictive Analytics for March Madness 2012

For a couple of years now Danny Tarlow and Commissioner Lee have hosted a Predictive Analytics competition for March Madness. It even got some press last year:"


Software to predict 'March Madness' basketball winner

MacGregor Campbell, consultant
Fine, computers, you can beat us at chess and Jeopardy!, just please let us keep March Madness. With the US National Collegiate Athletic Association's basketball tournament starting today, contestants in the second annual March Madness Predictive Analytics Challenge are attempting to build software that can pick winning teams better than humans.
The contest pits machine against machine to find out which algorithm can correctly predict the outcome of the 64-team contest. Tournament brackets must be chosen entirely by computer algorithm, and no specific team-based rules, such as "always pick Duke over North Carolina", are allowed. All contestants are restricted to using the same data set - team and player statistics from the 2006 season until last month.
Contest organiser Danny Tarlow's own entry started out as a movie recommendation engine similar to those used on sites like Netflix. He says that predicting what movie a particular person would like to see is similar to predicting how well a basketball team's attack will do against their opponent's defence: both interactions are driven by unknown rules.
To predict the result of a basketball game, his algorithm chews through loads of regular season data and uses probabilities to find equations that fit the outcomes of each game. It then uses these equations to pick which teams will win in tournament match-ups. "The algorithm knows nothing about basketball or details about any team. It just sees the outcome of each game in the season, and it tries to discover latent characteristics that best explain the outcomes," he says.
Other entries range from using genetic algorithms to evolve equations that can pick winners to more straightforward attempts to boil down a team's strengths and weaknesses to a single number, then pick the team with the higher number in each match-up.
Last year's contest had 10 entries, including a "pace" bracket that simply picked the higher-seeded team in each matchup. Six of the entries did better than this baseline, one even predicting underdog Butler University's surprising ascent to the final four.
Tarlow hopes for a better performance this year, but is well aware of the difficulty of predicting the outcome of an entire basketball tournament. "There's clearly a lot of luck that goes into having a successful bracket."
We'll know how the software programs fare soon - the round of 64 begins today."

I often think that the world of predictive analytics competitions is made up solely of Kaggle competitions, but there are lots of others out there. These two guys have run a good contest for a while now, so I encourage everyone to give it a try, especially if you are an R user instead of a Python guy.

I played with some models to do this, but none of them were ever outstanding. One I liked had a factor for streaky teams. I found that teams who had long runs of consecutive wins tended to do better than teams with similar records who did not. When I further tuned this with weighting for things like streaks later in the season and level of competition, it seemed to do better than anything else I tried. If you have the time, don't just fill out a bracket; predict one.
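For anyone who wants to try the streak idea, here is a rough sketch in Python. It is not the model I actually built. The function names, the minimum streak length of three, and the linear late-season weight are all assumptions made up for illustration. It finds each run of consecutive wins, then scores a team by streak length, average opponent strength during the streak, and how late in the season the streak ended.

```python
from typing import List, Sequence, Tuple


def win_streaks(results: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (start_index, length) for every run of consecutive wins."""
    streaks = []
    start = None
    for i, won in enumerate(results):
        if won and start is None:
            start = i                      # a new streak begins
        elif not won and start is not None:
            streaks.append((start, i - start))
            start = None                   # a loss ends the streak
    if start is not None:                  # season ended on a streak
        streaks.append((start, len(results) - start))
    return streaks


def streak_score(
    results: Sequence[bool],
    opponent_strength: Sequence[float],
    min_len: int = 3,       # assumed cutoff for what counts as a "long" run
    recency: float = 1.0,   # assumed extra weight for late-season streaks
) -> float:
    """Hypothetical streak factor: longer, later, tougher streaks score higher."""
    n = len(results)
    score = 0.0
    for start, length in win_streaks(results):
        if length < min_len:
            continue
        end = start + length
        # Level of competition: mean strength of opponents beaten in the run.
        quality = sum(opponent_strength[start:end]) / length
        # Streaks that end later in the season get a larger multiplier.
        late = 1.0 + recency * (end / n)
        score += length * quality * late
    return score


# Two made-up teams with the same 6-4 record (True = win), in game order.
streaky = [False, False, False, False, True, True, True, True, True, True]
steady = [True, False, True, True, False, True, False, True, False, True]
even_schedule = [1.0] * 10  # placeholder: every opponent equally strong

print(streak_score(streaky, even_schedule))
print(streak_score(steady, even_schedule))
```

With these made-up inputs the streaky team gets a positive score and the steady team gets zero, even though their records are identical. In a real entry this number would be one input next to the win-loss record, not the whole model.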

Thursday, March 1, 2012

Playing Golf with President Clinton in Palm Springs - Never pass up a special experience

My father has played in the Bob Hope since the 1960s. He has always enjoyed the event, but this year he thought he would pass on it because he was too busy. My father is 70 and has been retired for over a decade. When he told me of his plans to pass on this tournament, I told him I thought he was crazy. The Hope is a unique and special opportunity that I would be grateful to experience just once in my life. Most people never get to do something cool and special like this in their entire lives. I asked him to reconsider and play in the tournament.

To my great pleasure he did reconsider and joined the field of amateurs at the Hope. The result was unbelievable.

The tournament is no longer officially called the Bob Hope Desert Classic, but now goes by the name Humana Challenge, with the Clinton Foundation as its lead sponsor. It is a four-day tournament where amateurs are teamed up with pros for three of the days and play on three different courses. My father's draw was in the celebrity field, so he was on television and got to talk with many of the celebrities who played in the tournament. He also got to play with some well-known PGA pros. On the first day he played with Phil Mickelson. On the second day he played with Richard E. Lee, and on the final day he was supposed to play with Bud Cauley.

That was not even the best part. On the first day President Clinton joined my father and Phil Mickelson for a few holes! That is why you never pass up a unique experience. It could turn out to be even more special than you could ever possibly imagine.