Project in Information-Theoretic Modeling: Details for Problem 2
The data for Problem 2 consist of just one file: four-stocks.csv. There is no input; your program just needs to output an exact copy of this file, with no additional side information available.
Your solution should be a single program that handles both Problem 1 and Problem 2. In other words, you may assume that the side information is available as for Problem 1, and your program should produce the output files for both problems in the same directory. Otherwise, the practical procedures are the same as for Problem 1. The deadline is Tuesday, 13 November, at 9am.
The file four-stocks.csv consists of 10122 lines of text. Each line consists of four numbers separated by commas. All the numbers have two decimals.
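Since every number has exactly two decimals, each line can be handled as four integers (cents) rather than as text. The following is only a minimal sketch of that idea, not a prescribed approach: the function names are hypothetical, and it assumes non-negative values written without leading zeros or extra whitespace, which you should verify against the actual file before relying on the round trip.

```python
def to_cents(field):
    """Convert a string such as '12.34' into the integer 1234.

    Assumes exactly two decimals and a non-negative value.
    """
    dollars, cents = field.split(".")
    return int(dollars) * 100 + int(cents)


def parse_line(line):
    """Turn one comma-separated line into a list of four integers (cents)."""
    return [to_cents(field) for field in line.strip().split(",")]


def format_line(cents):
    """Inverse of parse_line: render integer cents back as 'd.cc,d.cc,...'."""
    return ",".join(f"{c // 100}.{c % 100:02d}" for c in cents)


def load_prices(path="four-stocks.csv"):
    """Read the whole file into a list of rows of integer cents."""
    with open(path, newline="") as f:
        return [parse_line(line) for line in f if line.strip()]
```

Because the output must be an exact copy, it is worth checking that format_line(parse_line(line)) reproduces every line of the file, including the line terminators, which this sketch does not preserve by itself.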
It may be helpful to know that the numbers in the file are stock prices from the New York Stock Exchange, in dollars and cents. The source of our data is infochimps. Each line contains the closing prices for one day of four stocks, namely Chevron (CVX), Hewlett-Packard (HPQ), International Business Machines (IBM) and Exxon Mobil (XOM), in that order. The data cover the period 1970–2010, and the lines are in chronological order. The file includes only days for which the price was listed for all four stocks, so some days are missing (although you can't see this from the file, because the date information has been omitted). If you think of the data as a 10122 by 4 matrix, plotting the columns of the matrix (i.e., the price development of the individual stocks) may help you get a better idea of the data.
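One possible way to look at the columns is sketched below. The helper names are hypothetical, the column order is taken from the description above, and the plotting part assumes that the third-party matplotlib package is installed; any other plotting tool would serve equally well.

```python
import csv

# Column order as stated in the problem description.
TICKERS = ("CVX", "HPQ", "IBM", "XOM")


def read_columns(lines):
    """Split an iterable of CSV lines into one list of prices per stock."""
    rows = [[float(x) for x in row] for row in csv.reader(lines) if row]
    # zip(*rows) transposes the row-wise data into columns.
    return dict(zip(TICKERS, (list(col) for col in zip(*rows))))


def plot_columns(columns):
    """Plot each stock's closing price against the line index."""
    import matplotlib.pyplot as plt  # assumed to be available

    for name, prices in columns.items():
        plt.plot(prices, label=name)
    plt.xlabel("line number (trading day index)")
    plt.ylabel("closing price (USD)")
    plt.legend()
    plt.show()


# Example usage, assuming the file is in the current directory:
#     with open("four-stocks.csv", newline="") as f:
#         plot_columns(read_columns(f))
```

The horizontal axis is the line number, not a calendar date, since the dates have been omitted and some days are missing.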