Thursday, July 31, 2008

Numbers Used to Quantify Writing Styles

At the end of May, Microsoft gave up in the race with Google to digitize all the world’s printed books and magazines. The motivation for these ambitious projects is to allow search engine technology to reach inside the pages of every book in print. These corporations envision a future with no need to actually thumb through a book at a store or library in search of information. Search engines will perform the task much faster and return the exact location inside of any book for any text or keyword a user is seeking.

Whether such a future is a good or bad one for writers and publishers remains to be seen. Publishers have been less than enthusiastic about these digitization projects because of fears of copyright infringement and potential loss of sales. My own view as a writer is that exposure is a good thing. Obscurity is a greater threat to a writer’s livelihood than copyright infringement.

Microsoft’s explanation for its decision to abandon the project came in classic corporate-speak. On a Microsoft blog, Satya Nadella, senior vice president for search, portal and advertising, wrote:

“Given the evolution of the Web and our strategy, we believe the next generation of search is about the development of an underlying, sustainable business model for the search engine, consumer, and content partner. For example, this past Wednesday we announced our strategy to focus on verticals with high commercial intent, such as travel, and offer users cash back on their purchases from our advertisers.”

If I ever need to write a parody of a corporate memo, these sentences are a good starting point. Just how many buzzwords (evolution, development, strategy, sustainable, verticals, model, etc.) are packed into just these two sentences with 65 words of prose? Actually, with digitized books, questions such as that can be answered. In fact, Google and Microsoft have been playing catch-up with Amazon’s “Search Inside the Book” program, launched in 2003. Publishers are now encouraged to submit electronic copies of printed books for Amazon to digitize and make searchable.

But Amazon’s “Search Inside the Book” program allows for more than just keyword searches. You can now look at statistics about the writing style of a book before you purchase it. I checked Amazon’s listing for my book, The Two Headed Quarter, and discovered all sorts of numerical facts that I did not know. My book has 672,289 characters arranged into 99,104 words. As expected in a book about deceptive numbers, the word “number” appears frequently (567 times), but it is only the second most frequently occurring word. I had no idea that the most frequently used word in my book, appearing 709 times, is “years.”

But most fascinating is the statistical summary of my writing style that Amazon provides. Potential buyers can view the “readability” and “complexity” analysis of the text. My book rates as highly readable with a Flesch-Kincaid index of 7.2, meaning that you only need a 7th grade education to read it. According to Amazon, 77% of all other books are harder to read, and in the category of personal finance books, 93% are more difficult to read. The easy reading arises in large part because the complexity analysis reveals that I have a simple writing style, averaging only 8.6 words per sentence.

Now that last number threw me. I opened my book to some random pages and found myself hard pressed to find sentences as short as 8 to 9 words, let alone enough sentences shorter than 8 words to make 8.6 the average. Amazon’s number for my average words per sentence just didn’t seem reasonable to me. I decided to run my own check using Microsoft Word’s analysis tools on some chapters from the original manuscript. I had never done that kind of analysis on my writing before, which shows you how much I think about readability statistics when I’m writing. The results:

For Chapter 1: 18.0 words per sentence, Flesch-Kincaid index 10.1
For Chapter 2: 17.2 words per sentence, Flesch-Kincaid index 10.7
For Chapter 5: 18.9 words per sentence, Flesch-Kincaid index 11.4
For Chapter 8: 19.3 words per sentence, Flesch-Kincaid index 10.8

This means the readability of my writing is consistently at a 10th to 11th grade level (not 7th grade) and about average in complexity (according to Rudolf Flesch, co-inventor of the Flesch-Kincaid scale, the average words per sentence found in reading material is 17).
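For reference, the Flesch-Kincaid grade level is conventionally computed from two averages: words per sentence and syllables per word. Whether Amazon or Microsoft Word applies exactly this formula is an assumption on my part, but it shows how strongly sentence length drives the result. With syllables per word held fixed, cutting the average from 18.0 words per sentence (Chapter 1) to 8.6 would lower the grade by roughly 3.7 levels.

```latex
\[
\text{grade} = 0.39\,\frac{\text{total words}}{\text{total sentences}}
             + 11.8\,\frac{\text{total syllables}}{\text{total words}} - 15.59
\]
\[
0.39 \times (18.0 - 8.6) \approx 3.7
\]
```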

So how can Amazon’s statistics be so skewed? I don’t know but I have a good idea. Again, my book is about numbers and there are many numbers in the book. There are figures and tables with numbers along with worked examples containing numbers that allow readers to figure out numerical answers to many common financial questions. Many of the numbers in the book contain decimal points. My hypothesis is that the software Amazon uses to perform readability analysis cannot tell the difference between a period and a decimal point. After all, there is no difference between the two characters. Only an actual reader who can interpret the meaning of a sentence knows that a decimal point inside a number does not terminate a sentence.
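The hypothesis is easy to illustrate. The sketch below is not Amazon’s software, which I have not seen. It simply compares a hypothetical naive splitter that ends a sentence at every period with one that only ends a sentence when the period is followed by whitespace or the end of the text. The function names and the sample text are made up for illustration.

```python
import re

def naive_sentences(text):
    # Treat every '.', '!' or '?' as a sentence end, decimal points included.
    return [s for s in re.split(r"[.!?]", text) if s.strip()]

def decimal_aware_sentences(text):
    # Only end a sentence when the terminator is followed by whitespace
    # or the end of the text, so the point inside "6.5" is left alone.
    return [s for s in re.split(r"[.!?](?=\s|$)", text) if s.strip()]

def words_per_sentence(sentences):
    # Average number of whitespace-separated words per sentence.
    total_words = sum(len(s.split()) for s in sentences)
    return total_words / len(sentences)

# Made-up sample with the kind of decimal numbers a finance book contains.
sample = (
    "A loan at 6.5 percent costs more than one at 5.75 percent. "
    "On a balance of 1000.00 dollars the difference is 7.50 dollars per year."
)

naive = naive_sentences(sample)
aware = decimal_aware_sentences(sample)

print(len(naive), "sentences,", round(words_per_sentence(naive), 1), "words each (naive)")
print(len(aware), "sentences,", round(words_per_sentence(aware), 1), "words each (decimal-aware)")
```

On this sample the naive splitter finds three times as many sentences as there really are, so its average sentence length comes out far too short. A text full of decimal numbers would pull a readability score down in the same way.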

I find it ironic that a book about deceptive numbers has an Amazon listing with a bunch of deceptive numbers to characterize its readability. Another example of companies spending time and resources to compile and publish useless numerical information. By the way, according to Amazon’s site my book has 3348 words per ounce and if you buy it you will receive 4715 words for each dollar you spend.

Joseph Ganem is a physicist and author of the award-winning The Two Headed Quarter: How to See Through Deceptive Numbers and Save Money on Everything You Buy

Tuesday, July 15, 2008

Does showing up for work cost more than you earn?

The Sunday (July 13, 2008) Baltimore Sun this week had a front-page article about mothers: “Back to Work, Like it or Not” with the subtitle that “Women who left jobs for children find economy reverses the trade.” Reporter Jill Rosen writes “The soured economy—with its ever-increasing gas, food and utility prices, its sinking home values and its corporate downsizing—is forcing mothers who have traded careers for families to think about trading back.”

I can relate to the article because my wife will change from part-time to full-time work this fall. Energy and food prices have increased so much in the past year that our budget no longer works as it once did. But our three children are teenagers and our current financial goal is getting them all through college. The Baltimore Sun article profiled women in much more difficult circumstances: mothers of toddlers and infants who planned on being stay-at-home moms. But the article left me wondering if working for a paycheck outside the home is a financially viable option for many of these women.

Women who can work professional salaried jobs will come out ahead financially by working outside the home. But women who work part-time or for lower wages might find that the same financial pressures forcing them outside the home might also make it impossible to realize any financial benefit.

How much money will these women have left after paying for commuting costs, daycare, work expenses, Social Security, and state and federal taxes? I’ve already seen television news reports about men with commutes so long it no longer pays to drive to work. I can imagine many common circumstances where stay-at-home moms would be hard pressed to increase the family income by working outside the home.

To assist families in figuring out how much additional income can be generated by working outside the home, I’ve teamed with my publisher to create a new calculator for the Compute Gas Savings Website. The calculator, at http://www.computegassavings.com/earningscalc.html, computes daily take-home pay after subtracting all of the costs associated with showing up for work.

Consider a married mom in the suburbs with one child who finds a job in a city that pays $12 per hour. She must commute 25 miles each way in a family SUV that gets 20 miles per gallon. She manages to find daycare for $25 per day and finds herself spending an additional $5 per day on average for other work-related costs such as coffee, snacks, clothes, and car maintenance. Because she is married, her additional income is not completely sheltered by deductions and exemptions. For this example we will assume a net federal tax rate of 10% and a state tax rate of 3%. After entering all these numbers in the calculator, it shows that she will take home just $36.17 per day out of her $96 per day gross earnings, or about $4.50 per hour.

Where did the money go? The combined cost of Social Security, federal, and state taxes takes one-quarter and the combined cost of daycare and gas takes one-third. That means it costs more than one half of her pay just to show up for work. This assumes she has only one child. Add the cost of daycare for a second child and daily earnings come to $11.17.

In fact, it is easy to construct plausible scenarios where real earnings per day become negative. Keep the same tax rates, lower her hourly pay to $10, or $80 per day, give her two children so that daycare becomes $50 per day, make the distance to work 30 miles and the gas mileage 15 miles per gallon, and the real earnings per day become a negative $7.52. Going to work will cost her more than she makes.
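The arithmetic behind these scenarios can be sketched in a few lines. This is not the calculator’s actual code. It is a minimal reconstruction that assumes an 8-hour workday, gas at $4.00 per gallon, and a 7.65% payroll tax for Social Security and Medicare, none of which are stated above. With those assumptions it lands within a cent of the figures quoted in this post.

```python
def daily_take_home(hourly_wage, miles_one_way, mpg, daycare_per_day,
                    other_costs_per_day=5.00,
                    federal_rate=0.10, state_rate=0.03,
                    payroll_rate=0.0765,   # assumed Social Security + Medicare rate
                    gas_price=4.00,        # assumed dollars per gallon
                    hours_per_day=8):      # assumed length of the workday
    """Return what is left of one day's pay after taxes and work costs."""
    gross = hourly_wage * hours_per_day
    taxes = gross * (federal_rate + state_rate + payroll_rate)
    gas = (2 * miles_one_way / mpg) * gas_price  # round trip
    return gross - taxes - gas - daycare_per_day - other_costs_per_day

# One child, $12 per hour, 25 miles each way at 20 miles per gallon.
one_child = daily_take_home(12, 25, 20, daycare_per_day=25)

# Same job with a second child in daycare.
two_children = daily_take_home(12, 25, 20, daycare_per_day=50)

# $10 per hour, two children, 30 miles each way at 15 miles per gallon.
worse_case = daily_take_home(10, 30, 15, daycare_per_day=50)

for label, amount in [("one child", one_child),
                      ("two children", two_children),
                      ("lower pay, longer commute", worse_case)]:
    print(f"{label}: ${amount:.2f} per day")
```

Changing any single input, such as the gas price or the daycare rate, shows how quickly a modest hourly wage can be consumed by the cost of showing up.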

I urge women looking to increase family income by working outside the home to test out different scenarios before deciding on a job. The hourly pay offered might not be the relevant number to compare when making a decision.

Thursday, July 3, 2008

Oil Prices Are Not Dependent on Oil Sources

A reader of my post “A Tax By Any Other Name” felt that I was naïve to suggest that supply and demand sets oil prices. The reader believes that oil pricing is driven more by speculation than anything else.

I agree with the reader that speculation is a large part of the reason we are paying $4 per gallon for gas at the moment. In my post I stated the supply and demand for oil is only a part of what sets the price for gas. I was mostly writing on how the increasing supply and decreasing demand for dollars is contributing to oil price increases. But in the last six months the falling dollar and increased oil consumption cannot account for the sudden 50% price rise for a barrel of oil. Speculation is clearly part of the problem.

It is no secret that the banks, investment houses, hedge funds, and oil companies now conduct most of the trading in oil futures on all-electronic foreign exchanges (such as the InterContinental Exchange or ICE) that are out of reach of U. S. regulators. A CBS news story quoted Michael Greenberger, a former top staffer at the Commodity Futures Trading Commission, as saying that 25% to 50% of the price per barrel might be due to speculation. Greenberger believes that “If you can trade out of the sight of U.S. regulators, you can manipulate these markets.”

What is especially disturbing about a “foreign exchange” like the ICE is that it is not actually foreign. Among its founding partners are Goldman Sachs and Morgan Stanley, its headquarters is in Atlanta, its primary data center is in Chicago, and it settles most trades in U. S. dollars. Despite all of its U. S. ties, the ICE claims that because its energy futures business is conducted in London (whatever that means), U. S. laws and regulations do not apply to it.

But all this supports the crux of my earlier argument that the energy independence that politicians speak of achieving is a myth because oil companies “will always sell their product to highest bidder, wherever the bidder resides.” The price of oil has no relationship to its origin.

In 2007 the U. S. imported about 10 million barrels of oil per day and produced about 5 million barrels per day. According to the Energy Information Administration, the top three countries of origin for imported oil in 2007 were Canada (1.85 million barrels per day), Mexico (1.47 million barrels per day) and Saudi Arabia (1.36 million barrels per day). Canada and Mexico each supply more oil to the United States than Saudi Arabia does.

If the U. S. decides to increase its domestic production of oil through offshore drilling or other means, it will contribute to the world supply of oil and that additional supply will be factored into the world price. Any politician who claims we can be more energy independent by drilling for more oil in the U. S. is being disingenuous. A far more relevant question to ask our political leadership is: Why are U. S. investment banks allowed to manipulate energy futures markets with no government oversight? I don’t think there is going to be a straight answer for that question.
