Data Mining: Problem #4
Problem description
Different animal species favor and, in the extreme, can survive only in particular environments. Climate is one of the main environmental factors. For example, the areas where the European Elk lives tend to be characterized by relatively low temperatures and heavy rainfall during certain months. Additionally, related species, such as Reindeer and European Elk, are likely to survive in the same or similar environments.
We have data about European climate and mammal species:
- A table where rows correspond to geographic areas and columns contain information about mammal species as well as the climate in each region. Some of the data is binary (presence of species in the region) whereas some attributes are real-valued (climatic characteristics of the region).
- Additionally, the species taxonomy is available (a hierarchy of species).
Which animal species are associated with which climatic conditions?
Hints
The new thing to be learned and used is frequent pattern mining with non-binary data, especially categorical and continuous attributes and their binarization/discretization, as well as hierarchies.
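As a rough illustration of these hints, the sketch below turns a few made-up rows into transactions: continuous climate values are discretized into labelled intervals, and each species is extended with its ancestors from a small taxonomy fragment. Everything in it is an assumption for illustration only: the attribute names (temp_jan, rain_jul), the cut points, the toy rows and the helper functions are hypothetical and are not taken from the actual data set. It uses a naive pair enumeration, not a full Apriori implementation.

```python
from itertools import combinations

# Hypothetical toy rows: one dict per geographic area (values are made up).
areas = [
    {"temp_jan": -8.0, "rain_jul": 95.0, "species": {"European Elk", "Reindeer"}},
    {"temp_jan": -6.5, "rain_jul": 88.0, "species": {"European Elk"}},
    {"temp_jan": 4.0, "rain_jul": 30.0, "species": {"Red Fox"}},
    {"temp_jan": -9.0, "rain_jul": 90.0, "species": {"Reindeer"}},
]

# Illustrative taxonomy fragment: child -> parent.
parent = {"European Elk": "Cervidae", "Reindeer": "Cervidae", "Red Fox": "Canidae"}

# Assumed cut points; in practice these would be chosen from the data.
CUTS = {"temp_jan": [-5.0, 0.0], "rain_jul": [50.0, 80.0]}
LABELS = ["low", "mid", "high"]


def discretize(value, cuts, labels):
    """Return the label of the interval that a real value falls into."""
    for cut, label in zip(cuts, labels):
        if value < cut:
            return label
    return labels[-1]


def to_transaction(area):
    """Binarize one row: interval items plus species and their ancestors."""
    items = set()
    for attr, cuts in CUTS.items():
        items.add(f"{attr}={discretize(area[attr], cuts, LABELS)}")
    for sp in area["species"]:
        items.add(sp)
        # Walk up the hierarchy so patterns can also form at higher levels.
        while sp in parent:
            sp = parent[sp]
            items.add(sp)
    return items


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_pairs(transactions, min_support):
    """Naively enumerate all item pairs and keep the frequent ones."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        s = support(set(pair), transactions)
        if s >= min_support:
            result[pair] = s
    return result


transactions = [to_transaction(a) for a in areas]
pairs = frequent_pairs(transactions, min_support=0.6)
```

On this toy input, the pair ("Cervidae", "temp_jan=low") has support 0.75 and passes the 0.6 threshold, whereas {"European Elk", "temp_jan=low"} only reaches 0.5. This is the kind of pattern that becomes visible only once the hierarchy is taken into account.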