ticdata               package:kernlab               R Documentation

_T_h_e _I_n_s_u_r_a_n_c_e _C_o_m_p_a_n_y _D_a_t_a

_D_e_s_c_r_i_p_t_i_o_n:

     This data set used in the CoIL 2000 Challenge contains information
     on customers of an insurance company. The data consists of 86
     variables and includes product usage data and socio-demographic
     data derived from zip area codes. The data was collected to answer
     the following question: Can you predict who would be interested in
     buying a caravan insurance policy and give an explanation why ?

_U_s_a_g_e:

     data(ticdata)

_F_o_r_m_a_t:

     ticdata: Dataset to train and validate prediction models and build
     a description (9822 customer records). Each record consists of 86
     attributes, containing sociodemographic data (attribute 1-43) and
     product ownership (attributes 44-86). The sociodemographic data is
     derived from zip codes. All customers living in areas with the
     same zip code have the same sociodemographic attributes. Attribute
     86, 'CARAVAN:Number of mobile home policies', is the target
     variable.

     Data Format

         1   'STYPE'     Customer Subtype
         2   'MAANTHUI'  Number of houses 1 - 10
         3   'MGEMOMV'   Avg size household 1 - 6
         4   'MGEMLEEF'  Average age
         5   'MOSHOOFD'  Customer main type
         6   'MGODRK'    Roman catholic
         7   'MGODPR'    Protestant ...
         8   'MGODOV'    Other religion
         9   'MGODGE'    No religion
         10  'MRELGE'    Married
         11  'MRELSA'    Living together
         12  'MRELOV'    Other relation
         13  'MFALLEEN'  Singles
         14  'MFGEKIND'  Household without children
         15  'MFWEKIND'  Household with children
         16  'MOPLHOOG'  High level education
         17  'MOPLMIDD'  Medium level education
         18  'MOPLLAAG'  Lower level education
         19  'MBERHOOG'  High status
         20  'MBERZELF'  Entrepreneur
         21  'MBERBOER'  Farmer
         22  'MBERMIDD'  Middle management
         23  'MBERARBG'  Skilled labourers
         24  'MBERARBO'  Unskilled labourers
         25  'MSKA'      Social class A
         26  'MSKB1'     Social class B1
         27  'MSKB2'     Social class B2
         28  'MSKC'      Social class C
         29  'MSKD'      Social class D
         30  'MHHUUR'    Rented house
         31  'MHKOOP'    Home owners
         32  'MAUT1'     1 car
         33  'MAUT2'     2 cars
         34  'MAUT0'     No car
         35  'MZFONDS'   National Health Service
         36  'MZPART'    Private health insurance
         37  'MINKM30'   Income  >30.000
         38  'MINK3045'  Income 30-45.000
         39  'MINK4575'  Income 45-75.000
         40  'MINK7512'  Income 75-122.000
         41  'MINK123M'  Income <123.000
         42  'MINKGEM'   Average income
         43  'MKOOPKLA'  Purchasing power class
         44  'PWAPART'   Contribution private third party insurance
         45  'PWABEDR'   Contribution third party insurance (firms)
         46  'PWALAND'   Contribution third party insurane (agriculture)
         47  'PPERSAUT'  Contribution car policies
         48  'PBESAUT'   Contribution delivery van policies
         49  'PMOTSCO'   Contribution motorcycle/scooter policies
         50  'PVRAAUT'   Contribution lorry policies
         51  'PAANHANG'  Contribution trailer policies
         52  'PTRACTOR'  Contribution tractor policies
         53  'PWERKT'    Contribution agricultural machines policies
         54  'PBROM'     Contribution moped policies
         55  'PLEVEN'    Contribution life insurances
         56  'PPERSONG'  Contribution private accident insurance policies
         57  'PGEZONG'   Contribution family accidents insurance policies
         58  'PWAOREG'   Contribution disability insurance policies
         59  'PBRAND'    Contribution fire policies
         60  'PZEILPL'   Contribution surfboard policies
         61  'PPLEZIER'  Contribution boat policies
         62  'PFIETS'    Contribution bicycle policies
         63  'PINBOED'   Contribution property insurance policies
         64  'PBYSTAND'  Contribution social security insurance policies
         65  'AWAPART'   Number of private third party insurance 1 - 12
         66  'AWABEDR'   Number of third party insurance (firms) ...
         67  'AWALAND'   Number of third party insurane (agriculture)
         68  'APERSAUT'  Number of car policies
         69  'ABESAUT'   Number of delivery van policies
         70  'AMOTSCO'   Number of motorcycle/scooter policies
         71  'AVRAAUT'   Number of lorry policies
         72  'AAANHANG'  Number of trailer policies
         73  'ATRACTOR'  Number of tractor policies
         74  'AWERKT'    Number of agricultural machines policies
         75  'ABROM'     Number of moped policies
         76  'ALEVEN'    Number of life insurances
         77  'APERSONG'  Number of private accident insurance policies
         78  'AGEZONG'   Number of family accidents insurance policies
         79  'AWAOREG'   Number of disability insurance policies
         80  'ABRAND'    Number of fire policies
         81  'AZEILPL'   Number of surfboard policies
         82  'APLEZIER'  Number of boat policies
         83  'AFIETS'    Number of bicycle policies
         84  'AINBOED'   Number of property insurance policies
         85  'ABYSTAND'  Number of social security insurance policies
         86  'CARAVAN'   Number of mobile home policies 0 - 1

     Note: All the variables starting with M are zipcode variables.
     They give information on the distribution of that variable, e.g.
     Rented house, in the zipcode area of the customer.

_D_e_t_a_i_l_s:

     Information about the insurance company customers consists of 86
     variables and includes product usage data and socio-demographic
     data derived from zip area codes. The data was supplied by the
     Dutch data mining company Sentient Machine Research and is based
     on a real world business problem. The training set contains over
     5000 descriptions of customers, including the information of
     whether or not they have a caravan insurance policy. The test set
     contains 4000 customers. The test and data set are merged in the
     ticdata set. More information about the data set and the CoIL 2000
     Challenge along with publications based on the data set can be
     found at <URL: http://www.liacs.nl/~putten/library/cc2000/>.

_S_o_u_r_c_e:

        *  UCI KDD Archive:<URL: http://kdd.ics.uci.edu> 

        *  Donor:  Sentient Machine Research 
            Peter van der Putten 
            Sentient Machine Research 
            Baarsjesweg 224 
            1058 AA Amsterdam 
            The Netherlands 
            +31 20 6186927 
            pvdputten@hotmail.com, putten@liacs.nl

_R_e_f_e_r_e_n_c_e_s:

     Peter van der Putten, Michel de Ruiter, Maarten van Someren  _CoIL
     Challenge 2000 Tasks and Results: Predicting and Explaining
     Caravan Policy Ownership_
      <URL: http://www.liacs.nl/~putten/library/cc2000/>

