Financial services is a big user of Big Data, and an innovator too. One good example is mortgage bond trading. To answer your questions for that case:
What kind of data did these companies use? What was the size of the data?
- Full histories of each mortgage issued over many years, with month-by-month payment records against each loan. (Billions of rows)
- Long borrower credit histories. (Billions of rows)
- Home price indices. (Not as big)
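To make the shape of that data concrete, here is a minimal sketch (in Python/pandas) of how loan-level payment histories, borrower credit records, and a home price index get linked into one loan-month panel. All table and column names here are made up for illustration; the real vendor feeds have their own schemas, and at billions of rows this join runs in a database rather than in memory, but the linkage logic is the same.

```
import pandas as pd

# Hypothetical loan-level payment history: one row per loan per month.
payments = pd.DataFrame({
    "loan_id": [101, 101, 102],
    "month": ["2012-01", "2012-02", "2012-01"],
    "balance": [248_000.0, 247_500.0, 410_000.0],
    "days_delinquent": [0, 30, 0],
})

# Hypothetical borrower credit history: one row per loan per month.
credit = pd.DataFrame({
    "loan_id": [101, 102],
    "month": ["2012-01", "2012-01"],
    "credit_score": [680, 745],
})

# Home price index by metro area and month (much smaller than the loan data).
hpi = pd.DataFrame({
    "metro": ["Phoenix", "Phoenix"],
    "month": ["2012-01", "2012-02"],
    "hpi": [98.5, 97.9],
})

# Origination attributes link each loan to its metro and original appraisal.
loans = pd.DataFrame({
    "loan_id": [101, 102],
    "metro": ["Phoenix", "Phoenix"],
    "original_value": [310_000.0, 512_000.0],
    "origination_hpi": [120.0, 120.0],
})

# Join everything down to one row per loan per month.
panel = (payments
         .merge(credit, on=["loan_id", "month"], how="left")
         .merge(loans, on="loan_id")
         .merge(hpi, on=["metro", "month"], how="left"))

# Estimate the current home value by scaling the original appraisal by the HPI,
# then compute the current loan-to-value ratio -- a key driver of default risk.
panel["est_home_value"] = panel["original_value"] * panel["hpi"] / panel["origination_hpi"]
panel["current_ltv"] = panel["balance"] / panel["est_home_value"]
print(panel[["loan_id", "month", "days_delinquent", "credit_score", "current_ltv"]])
```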
What kind of tools and technologies did they use to process the data?
It varies. Some use in-house solutions built on databases like Netezza or Teradata. Others access the data through systems provided by the data vendors (CoreLogic, Experian, etc.). Some banks use columnar database technologies like kdb+ or 1010data.
What was the problem they were facing, and how did the insight they got from the data help them resolve it?
The key issue is determining when mortgage bonds (mortgage-backed securities) will prepay or default. This is especially important for bonds that lack a government guarantee. By digging into payment histories and credit files, and understanding the current value of the underlying house, it's possible to estimate the likelihood of a default. Adding an interest rate model and a prepayment model also helps predict the likelihood of a prepayment.
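As a rough illustration of that modelling step, here is a hedged sketch of a loan-level default model. The choice of features (current LTV, credit score, recent delinquency) and of a plain logistic regression are my assumptions for the example; real desk models are more involved (competing-risk and transition models run across interest-rate and home-price scenarios), but the basic idea of turning linked loan data into a default probability is the same.

```
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical loan-level features derived from a joined panel like the one above:
# current LTV, borrower credit score, and months delinquent in the past year.
X = np.array([
    [0.55, 760, 0],
    [0.80, 705, 0],
    [1.10, 640, 2],
    [1.25, 600, 4],
    [0.95, 680, 1],
    [0.60, 720, 0],
])
# Toy labels: 1 = the loan eventually defaulted, 0 = it did not.
y = np.array([0, 0, 1, 1, 0, 0])

model = LogisticRegression()
model.fit(X, y)

# Probability of default for a new loan: underwater (LTV 1.15),
# weak credit (620), and recently delinquent (3 months).
p_default = model.predict_proba([[1.15, 620, 3]])[0, 1]
print(f"Estimated default probability: {p_default:.2f}")
```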
How did they select the tool/technology to suit their needs?
If the project is driven by internal IT, it's usually built on a large database vendor like Oracle, Teradata, or Netezza. If it's driven by the quants, they are more likely to go straight to the data vendor, or to a third-party all-in-one system.
What kind of patterns did they identify in the data, and what kind of patterns were they looking for?
Linking the data gives great insight into which borrowers are likely to default on their loans, and which are likely to prepay them. When the loans are aggregated into bonds, that can be the difference between a bond issued at $100,000,000 being worth that amount or as little as $20,000,000.
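To see how loan-level estimates roll up into that kind of bond-level difference, here is a toy expected-principal calculation. The default probabilities and loss severities are invented purely to show the aggregation; a real valuation discounts projected cash flows across rate and prepayment scenarios.

```
# Toy pool: 1,000 loans of $100,000 each backing a $100,000,000 bond.
n_loans = 1_000
balance = 100_000.0

def expected_principal(p_default: float, loss_severity: float) -> float:
    """Expected principal returned per loan, given a default probability
    and the fraction of balance lost if the loan defaults."""
    return balance * (1.0 - p_default * loss_severity)

# Benign scenario: 5% of loans default, each losing 40% of its balance.
benign = n_loans * expected_principal(p_default=0.05, loss_severity=0.40)

# Stressed scenario: 90% default with 90% severity (a deeply underwater pool).
stressed = n_loans * expected_principal(p_default=0.90, loss_severity=0.90)

print(f"Benign pool value:   ${benign:,.0f}")    # ~ $98,000,000
print(f"Stressed pool value: ${stressed:,.0f}")  # ~ $19,000,000
```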