This is a team project led by Mengshu Zhang. We placed in the Top 23 in the Humana-Mays Healthcare Competition.

1. Executive Summary

The poem quoted above was written by the Chinese poet Du Fu around 1,300 years ago and vividly expresses his vision of all humankind living free of housing insecurity. Today, per capita GDP in the United States has reached $69,298 (2021), while the average housing cost in the United States has reached $428,700.(6)(7) Yet housing insecurity remains a serious issue even in such a developed society.

This study aims to help Humana understand the current housing insecurity problem in the United States through advanced analytical methods. Specifically, we build several models to identify members experiencing housing insecurity from the information available about them. Using the best predictive model, we examine the features that affect housing insecurity the most. Combining these findings with existing research, we offer Humana suggestions for meeting the healthcare needs of members who face housing insecurity.

The overall objective is split into three parts:
Build a predictive model to identify the underlying housing-insecure members
Employ important features from the model to categorize and analyze the reasons for housing insecurity
Make recommendations to target housing-insecure members in different categories

Based on the model we construct, we analyze several important features contributing to housing insecurity, including participants’ loans and income, disease and medication. Combining these factors with existing studies, we offer suggestions from the perspectives of both government and large healthcare organizations. We encourage the government to devote more of its budget to improving public infrastructure, including public hospitals and public transportation systems, and to constructing more public housing. We suggest that large healthcare organizations make full use of their data to identify people facing housing insecurity and cooperate with the government to build a network platform. Large organizations can act as a broker that matches suitable resources with those in need, saving time and money for both sides.

2. Case Background

2.1 General Setting

Housing insecurity means a lack of stability in having personal accommodation.(2) According to 2021 survey data, about 3.7 million people in the United States face housing insecurity, and about 7.7 million adults are behind on rent.(4) Another set of figures indicates that one in five households in the United States lacked confidence in their ability to make their next rent or mortgage payment.(5) Even without counting homelessness, these data all show that housing insecurity is a serious social issue in the United States. According to Humana, 60% of health is impacted by social factors, environmental conditions and lifestyle behaviors.

Therefore, we hope that by analyzing the existing data, we can identify the main factors that cause housing insecurity and the similarities among the people facing it. This will not only help Humana decide what to do next with its $25 million investment in housing insecurity, but also help the government, public organizations and researchers identify the people who face this problem now or may face it in the future, and help them in a better and more timely manner.

Many studies have already examined the housing-insecure population. Some data show that race is an important factor, with Hispanic and Black people currently facing more serious problems.(5) Work is another important factor: tenants who have lost their source of income are 2.7 times more likely to be housing insecure than those with stable jobs.(5) Regional differences also exist, and housing insecurity is more common in several southern states, including Florida, Louisiana, Mississippi and Texas.(5) One of the difficulties in housing insecurity research is defining what housing insecurity is. The model used by researchers at the University of Southern California studies several parameters, including housing instability, housing affordability, housing safety, housing quality, neighborhood safety, neighborhood quality and homelessness.(2) The Department of Health and Human Services also takes overcrowding into consideration.(2) The Centers for Disease Control and Prevention is another agency that often studies this issue, and the question it frequently asks is: “How often in the past 12 months would you say you were worried or stressed about having enough money to pay your rent/mortgage? Sometimes, usually, or always?”(2) The most authoritative source for studying housing insecurity in the United States is the American Housing Survey (AHS), whose definition of housing insecurity covers problems that people may face along multiple dimensions, including affordability, safety, quality, insecurity, and loss of housing.(3) In its view, missing a rent or utility payment or receiving an eviction notice is a warning sign.(3) In another set of data, 37.1 million households spend 30% or more of their income on housing, and 83.5% of them have an average annual income of less than $15,000. For 17.6 million households, housing expenses account for 50% or more of their income.(1) Such unsustainable spending ratios indicate that these households face housing insecurity.
As the third largest health insurance provider in the nation, Humana is committed to helping millions of medical and specialty members achieve their best health. It wants to create a new kind of integrated care with the power to improve health and well-being and lower costs. In April 2022, Humana expanded its national commitment to affordable housing with an additional $25 million investment.(8) This shows that Humana cares about housing stability for vulnerable populations, which motivates us to find an effective way to characterize housing-insecure populations and come up with corresponding solutions.

Existing research serves as the reference for our work. We aim to identify housing insecurity through data, so that we can provide better-targeted solutions once we find what the affected population has in common.

2.2 Business Problem Statement

The current business problem is to identify unobserved groups suffering from housing insecurity among the Humana MAPD members. Based on the provided data, the first problem is to build a precise classification model from the medical claims and condition features, pharmacy claims features, demographics, and other features. The second problem is to identify the important features shared by housing-insecure groups. The third problem is to suggest possible solutions to the existing problem and analyze their feasibility.

2.3 Definition of Metrics

With the predictions and the true labels, we can compare performance across models. The metric used here is the AUC-ROC score, the area under the receiver operating characteristic curve, because it not only aligns with our ultimate goal but also measures “the performance of the model at distinguishing between the positive and negative classes.” Simply put, a model with a higher score tends to distinguish the positive class points from the negative class points more accurately.
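To make the metric concrete, here is a minimal sketch of AUC-ROC computed directly from its pairwise interpretation: the probability that a randomly chosen positive record receives a higher score than a randomly chosen negative one. The function name and the toy labels and scores are illustrative assumptions, not part of the project code, and a library routine would normally be used on the full data.

```python
from typing import Sequence


def auc_roc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC-ROC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC-ROC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and scores, invented for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_roc(labels, scores)  # 3 of the 4 pairs are ranked correctly
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the metric does not depend on a classification threshold.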

3. Data Preparation

3.1 Data Overview (General Distribution)

Humana’s training data set contains 48,300 rows and 881 columns. Each row is a participant record collected by Humana, and the participant may or may not be experiencing housing insecurity. Of the columns, 880 are features of these participants and one, hi_flag, indicates whether the participant is housing insecure: a value of “1” indicates that a participant is experiencing housing insecurity, and a value of “0” indicates that the participant is not.

We notice that the data set is relatively well organized, which means most variables can be interpreted directly. However, many variables still contain missing values and need transformation before further analysis. Various data forms exist, including integer, scalar, binary categorical, and multi-class categorical.

We also notice that the data set is highly imbalanced with respect to whether a participant is experiencing housing insecurity. In the 75% training split used later (36,225 rows), 1,582 participants are flagged “1” and 34,643 are flagged “0”. A negative class more than 21 times larger than the positive class makes the data naturally hard for machine learning, as discussed later.

The variables describe participants from different perspectives. The majority of variables concern medical claims and pharmacy prescriptions. We manually classify them into groups to see how the data is distributed (figure 1). The data also includes personal credit and income, regional economics, and general information such as sex and ID. The goal is to find the most influential variables among them.

3.2 Data Cleaning

3.2.1 Dropping columns

We first dropped 211 columns that hold a single distinct value. If every record contains exactly the same information in a column, the corresponding feature is not helpful for predicting our target variable.
Then, we detected 260 columns that contained some null values and dropped the 12 columns with over 80% null values, such as “oms_risk_adj_payment_rate_b_amt,” “cms_tot_partd_payment_amt,” etc. After dropping columns, we still have 658 features in our dataset, but some of them will be eliminated later in the modeling section.
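The two dropping rules can be sketched as follows. This is a simplified illustration under stated assumptions: the table is a plain dictionary of column lists with None marking a missing value, and the function and column names are hypothetical. The project itself most likely used a data-frame library.

```python
from typing import Dict, List, Optional

Column = List[Optional[object]]


def drop_uninformative_columns(
    table: Dict[str, Column], max_null_share: float = 0.8
) -> Dict[str, Column]:
    """Drop constant columns and columns whose null share exceeds the threshold."""
    kept: Dict[str, Column] = {}
    for name, values in table.items():
        # Rule 1: the same value in every row carries no signal for hi_flag.
        if len(set(values)) <= 1:
            continue
        # Rule 2: more than 80% missing leaves too little to learn from.
        null_share = sum(v is None for v in values) / len(values)
        if null_share > max_null_share:
            continue
        kept[name] = values
    return kept


# Toy table with hypothetical column names.
toy_table = {
    "constant_col": [1, 1, 1, 1, 1, 1],
    "mostly_null_col": [None, None, None, None, None, 3.2],
    "useful_col": [0.5, None, 1.5, 2.0, 0.1, 0.7],
}
cleaned = drop_uninformative_columns(toy_table)
```

Only "useful_col" survives here: it varies across rows and only one of its six values is missing.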

3.2.2 Missing Value Imputation

To deal with the remaining 248 columns with a few missing values, we filled in the data based on the data type of the column. For categorical variables, we replaced missing values with the most frequent string in the column; for numerical variables, we filled missing values with the median.
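The imputation rule can be illustrated with a short sketch. It assumes a column is a Python list with None for missing entries and treats a column as categorical when all of its present values are strings; the function name is hypothetical.

```python
from collections import Counter
from statistics import median
from typing import List, Optional


def fill_missing(values: List[Optional[object]]) -> List[object]:
    """Fill None with the mode for string columns and the median for numeric ones."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to learn a fill value from
    if all(isinstance(v, str) for v in present):
        fill = Counter(present).most_common(1)[0][0]  # most frequent string
    else:
        fill = median(present)
    return [fill if v is None else v for v in values]


filled_categorical = fill_missing(["M", None, "F", "M"])
filled_numeric = fill_missing([1.0, None, 3.0, 10.0])
```

The median is less sensitive than the mean to extreme values such as the 10.0 above, which matters for skewed cost and claims features.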

3.2.3 Feature Encoding

The stored data types of several features are misleading, so we first convert them to the appropriate types and then encode the data. There are several common ways to encode categorical data, but we choose one-hot encoding to create dummy variables for all 7 categorical features because most of them are non-ordinal. After encoding all the features properly, the total number of columns is now 683.
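As a small illustration of one-hot encoding, the sketch below turns one categorical column into one 0/1 dummy column per category. The column name "region" and its values are invented for the example and are not taken from the Humana data.

```python
from typing import Dict, List


def one_hot(name: str, values: List[str]) -> Dict[str, List[int]]:
    """Create one 0/1 dummy column per distinct category, in sorted order."""
    categories = sorted(set(values))
    return {
        f"{name}_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


# Hypothetical categorical column.
encoded = one_hot("region", ["south", "west", "south", "north"])
```

Each row has exactly one dummy set to 1, so no artificial ordering is imposed on the categories. This is the reason for preferring one-hot encoding over integer labels for non-ordinal data.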

3.3 Exploratory Data Analysis

We then perform basic exploratory analyses on the preliminarily cleaned dataset, including the distribution of each feature.
Age Distribution: Most of the Humana members were people around 60-80 years old.

4. Modeling

4.1 Dimensionality Reduction

Before actually building models, we realize that we cannot analyze each feature thoroughly because of the sheer number of columns. A common method is principal component analysis (PCA), which combines features at the cost of latent features that may be hard to interpret. However, the components returned by PCA are not satisfying in our case because we fail to find a sharp decline in explained variance.
We therefore decide to drop the least important features using mutual information and the Gini index. Nevertheless, almost all the unimportant features we find are already excluded by a default XGBoost model.
Therefore, we finally drop all the features that get a zero importance score, which means XGBoost does not use them in any split when building the boosted trees.
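The filtering idea above can be sketched with mutual information as the per-feature score. This is an illustrative stand-in under stated assumptions: it handles discrete features only, the feature names and toy data are invented, and the final filter in the project used XGBoost importance scores rather than this function.

```python
from collections import Counter
from math import log
from typing import Dict, Hashable, List, Sequence


def mutual_information(feature: Sequence[Hashable], target: Sequence[Hashable]) -> float:
    """Mutual information (in nats) between a discrete feature and the target."""
    n = len(feature)
    joint = Counter(zip(feature, target))
    feature_counts = Counter(feature)
    target_counts = Counter(target)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts.
        mi += p_xy * log(p_xy * n * n / (feature_counts[x] * target_counts[y]))
    return mi


# Toy target and two hypothetical features.
toy_target = [0, 0, 1, 1]
toy_features: Dict[str, List[int]] = {
    "informative": [0, 0, 1, 1],  # identical to the target
    "noise": [0, 1, 0, 1],        # independent of the target
}
mi_scores = {name: mutual_information(col, toy_target) for name, col in toy_features.items()}

# Same filter as in the text: drop every feature whose score is zero.
kept_features = [name for name, score in mi_scores.items() if score > 0]
```

The same last line applies to any per-feature score, including tree-based importances.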

4.2 Model Selection

After excluding the least important features, we split the cleaned “training data” into two subsets, the training set and the test set, with 75% and 25% of the rows respectively. This lets us construct models on the training set and apply them to the test set to predict outputs (i.e., the housing insecurity flag). Since we already know the actual result in the test set, we can identify potential issues in our model and improve its performance through some interventions.
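A minimal sketch of the 75/25 split on row indices follows. The seed and the function name are assumptions for illustration, and the text does not state whether the original split was stratified.

```python
import random
from typing import List, Tuple


def train_test_split_indices(
    n_rows: int, test_share: float = 0.25, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle row indices and hold out a fixed share as the test set."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_rows * test_share)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(48300)
```

With 48,300 rows this gives 36,225 training rows, which matches the 34,643 + 1,582 training records reported in the text, and 12,075 test rows.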
As mentioned above, the target variable, “hi_flag,” is highly imbalanced in our dataset. Specifically, in our training set there are 34,643 records without housing insecurity and only 1,582 people facing housing insecurity. Such an imbalanced classification problem introduces bias, which in machine learning results in poor predictive performance on the minority class, i.e., in correctly identifying housing-insecure people. To solve this issue, one often resamples the data. However, we believe that each row, regardless of the value of its target variable, contains information that matters, so we decided to weight the classes in our models instead to achieve better performance.
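To show what weighting the classes means in practice, here is a sketch of a log loss in which errors on the positive class are multiplied by the negative-to-positive ratio. The function is a simplified illustration, not the loss implementation of any particular library.

```python
from math import log
from typing import Sequence


def weighted_log_loss(
    y_true: Sequence[int], y_prob: Sequence[float], pos_weight: float = 1.0
) -> float:
    """Mean log loss where each positive example counts pos_weight times."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        if y == 1:
            total += -pos_weight * log(p)
        else:
            total += -log(1.0 - p)
    return total / len(y_true)


# Class counts in the training set, as reported in the text.
NEGATIVES, POSITIVES = 34643, 1582
pos_weight = NEGATIVES / POSITIVES  # about 21.9

# A housing-insecure member scored at only 0.1:
missed_positive_plain = weighted_log_loss([1], [0.1])
missed_positive_weighted = weighted_log_loss([1], [0.1], pos_weight)
```

With the weight applied, missing one positive record costs about as much as missing 22 negative ones, so the model cannot minimize its loss by predicting "0" for everyone, and no rows have to be discarded or duplicated.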
Next, we build three classification models: logistic regression, random forest and XGBoost. At this stage, all hyperparameters except the ones associated with class imbalance are left at their defaults in each model. We then use the AUC-ROC score, as mentioned before, to rank the default models. With hyperparameters at their defaults, XGBoost has the best AUC-ROC score.

Model            K-fold cross-validation AUC-ROC
Logistic         0.612596
Random Forest    0.634235
XGBoost          0.722239
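The scores above come from K-fold cross-validation, which can be sketched as follows. The number of folds is not stated in the text, so k = 5 as a default is an assumption, and the function name is hypothetical.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_rows: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, validation
        start += size


# Tiny example: 10 rows, 3 folds.
folds = list(k_fold_indices(10, k=3))
fold_sizes = [len(validation) for _, validation in folds]
```

Each model is fitted k times, once per training part, and scored on the matching validation part. The reported figure is then the average of the k validation AUC-ROC scores.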

4.3 Final Model by Tuning Hyperparameters

Based on the rankings, we decide to focus exclusively on the XGBoost model, which gives us the best AUC-ROC score with default hyperparameters. XGBoost stands for “Extreme Gradient Boosting” and is an efficient implementation of the stochastic gradient boosting machine learning algorithm.
With the current XGBoost model, we use tools including “gridsearch,” “cv,” and “hyperopt” to tune the hyperparameters. Here are the highlights of the hyperparameters we choose.
scale_pos_weight = 21.8982. This controls the balance of positive and negative weights, and the value is set by the class proportion of our target variable in the training set (34,643 / 1,582).
max_depth = 3. This hyperparameter refers to the maximum depth of a tree. Increasing this value will make the model more complex and more likely to overfit.
min_child_weight = 5. It defines the minimum sum of instance weights needed in a child. Increasing this number from the default value of 1 makes the model relatively conservative.
gamma = 0.5117. Gamma is the minimum loss reduction required to make a further partition on a leaf node of the tree. We raise it from 0 to 0.5117, which makes the model more conservative.
colsample_bylevel = 0.6152, colsample_bynode = 0.8363, colsample_bytree = 0.8445. These hyperparameters are tuned in the same cross-validation process because the three form a family of parameters for subsampling columns. It is not guaranteed that each hyperparameter is individually optimal, but the three values combined return the best performance.
reg_alpha = 54.8129, reg_lambda = 2.1991. These two hyperparameters are the L1 and L2 regularization terms on weights. In our case, tuning reg_alpha improves overall performance more than tuning reg_lambda.
With these hyperparameters, we improve our AUC-ROC score by about 0.08 (8 percentage points), from 0.66 to 0.7408386.
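For reference, the tuned values can be collected into a single parameter dictionary. This sketch only builds the dictionary and checks the reported improvement. The keys follow XGBoost's parameter names as used in the text, while the variable names are assumptions and no model is trained here.

```python
# Tuned values as reported in the text.
tuned_params = {
    "scale_pos_weight": 21.8982,  # negatives / positives in the training set
    "max_depth": 3,
    "min_child_weight": 5,
    "gamma": 0.5117,
    "colsample_bylevel": 0.6152,
    "colsample_bynode": 0.8363,
    "colsample_bytree": 0.8445,
    "reg_alpha": 54.8129,         # L1 regularization
    "reg_lambda": 2.1991,         # L2 regularization
}

# Reported AUC-ROC before and after tuning.
auc_before, auc_after = 0.66, 0.7408386
absolute_gain = auc_after - auc_before  # in AUC points
relative_gain = absolute_gain / auc_before
```

The gain is about 0.08 in absolute AUC terms, which corresponds to a relative improvement of roughly 12% over the starting score. A dictionary like this would typically be passed to the model constructor as keyword arguments.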

5. Feature Importance and Business Implications

5.1 Feature Importance of Final Model

The figure above shows the top 20 important features. Their scores range from 0.007 to a high of 0.063. After careful analysis, we believe these key features can be divided into three categories. The first relates to the financial situation of the participant and mainly reflects income and loans. The second relates to the health of the participant, which we call disease and medication. This category can be divided into physical health and mental health, so we focus on both behavioral and non-behavioral diseases. The last category, others, covers features that are complex and difficult to classify, including age and demographics. After analyzing each feature that affects housing insecurity, we also try to come up with reasonable solutions and analyze their feasibility.

5.2 Income and Loans

Among the top 20 features, many relate to financial health, including ‘cms_low_income_ind’, ‘cons_ccip’, ‘cons_stlindex’, and ‘cons_stlnindx’. Among them, the most directly income-related feature is ‘cons_ccip’, which represents the percentile of the participant’s income in the census income distribution. Unfortunately, when we compare people known to be housing insecure with people known not to be, we find no clear difference in the medians of their percentiles.

Looking further into income, we think ‘cms_low_income_ind’ is more informative. This feature indicates whether a participant receives a subsidy from CMS. We divided the data into two groups and plotted each one. The first group contains participants who are not housing insecure, split into those who have not accepted the subsidy and those who have. The second group contains participants who are housing insecure, split the same way. Comparing the two groups, we find a significant difference in the share who accept the subsidy. Among people who are not housing insecure, only 15.6% accepted the subsidy from CMS, which we take as the normal rate of acceptance. Among those who are housing insecure, 30.2% accepted it (figure below). This near doubling makes us ask what exactly the CMS subsidy is.
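The group comparison behind these percentages can be sketched as below. The ten toy records are invented for illustration. Only the 15.6% and 30.2% figures come from the analysis.

```python
from typing import Dict, Sequence


def rate_by_group(flags: Sequence[int], indicator: Sequence[int]) -> Dict[int, float]:
    """Share of rows with indicator == 1 within each value of the flag."""
    totals: Dict[int, int] = {}
    hits: Dict[int, int] = {}
    for flag, value in zip(flags, indicator):
        totals[flag] = totals.get(flag, 0) + 1
        hits[flag] = hits.get(flag, 0) + value
    return {flag: hits[flag] / totals[flag] for flag in totals}


# Toy data: hi_flag and a low-income subsidy indicator for ten records.
toy_hi_flag = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
toy_low_income = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]
toy_rates = rate_by_group(toy_hi_flag, toy_low_income)

# Ratio of the two rates reported in the text.
reported_ratio = 30.2 / 15.6
```

The reported rates give a ratio of about 1.94, which is why the share is described as roughly double. The same function applies to any binary indicator, such as the frailty flag discussed later.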
From the official website of the Centers for Medicare & Medicaid Services, we found some information about the subsidy: ‘The Medicare Prescription Drug, Improvement and Modernization Act of 2003 (MMA), established the Medicare Prescription Drug Program, also known as Medicare Part D, making prescription drug coverage available to Medicare beneficiaries.’(9) The program is committed to helping people with limited income and resources pay for prescription drugs. Common state application criteria include income of the applicant and spouse not exceeding 150% of the Federal Poverty Level and resources of the applicant and spouse not exceeding $11,010 for one person and $22,010 for a couple in 2009; family size is also considered.(9) Resources at or below $6,600 for an individual and $9,910 for a couple, with income below 135% of the Federal Poverty Level, entitle the applicant(s) to the full subsidy.(9) The criterion used in the federal Medicare Savings Programs (MSP) is that maximum resources do not exceed $4,000 for one person and $6,000 per couple, with a maximum income of 135% of the Federal Poverty Level.(9) Different states modify the income and resource rules for the MSP programs. The main considerations here are therefore income and resources. The main reference for income is the Federal Poverty Level (FPL), a figure released annually by the Department of Health and Human Services (HHS).(10) According to the 2022 FPL, the income threshold is $13,590 for an individual and $18,310 for a family of 2.(10) For comparison, the median income across the country in 2022 is $44,225.(11) It can be seen from this that the people who qualify for the CMS subsidy are mainly low-income people with few liquid resources, which is an important factor in their housing insecurity.
Another important factor is loans. The top 20 features include ‘cons_stlindex’ and ‘cons_stlnindx’, which represent the short term loan index and the student loan index respectively.
As the figure below shows, groups with a short term loan index of 7 or 8 have a clearly lower probability of being housing insecure than the others. Unfortunately, we do not have much information about this index and cannot conduct further research.

5.3 Disease and Medication

We first discuss what the participants’ general health indicates about housing insecurity. The two main features here are ‘cms_frailty_ind’ and ‘total_physician_office_allowed_pmpm_cost’.
‘cms_frailty_ind’ indicates whether a participant is deemed frail. The definition of frail given in the description includes specific diagnoses, multiple serious chronic conditions, functional impairment or other factors. When a participant is ‘deemed frail’, we take this to mean not that the participant is necessarily unhealthy, but that they are ‘someone who is not very strong or healthy’, that is, not in the best physical state. The reasons can be both physical and mental. We split the data into two groups for analysis: participants without housing insecurity, divided into those not deemed frail and those deemed frail, and participants with housing insecurity, divided the same way. Among people without housing insecurity, 3.6% are deemed frail. Among those with housing insecurity, the proportion nearly triples, to 9.6%.

5.3.1 Behavioral Diseases

We can tell there is little difference between the means, as most participants have 0 claims per month related to behavioral diseases. However, the outliers show that among participants who do have claims, the average is higher in the group that is not housing insecure. This is somewhat counterintuitive. In an existing study, the prevalence of poor mental health was 89% in women and 85.3% in men, much higher than in the general population of the same area (19.5% and 14.5%, respectively).(14) This suggests that people who are housing insecure are more likely to have poor mental health. One factor behind the reversal in our data set may be the cost of diagnosing behavioral diseases. Diagnosing mental health conditions is expensive. According to GoodTherapy, therapy generally ranges from $65 per hour to $250 or more for those without insurance.(15) A patient with major depression can spend an average of $10,836 a year on health costs.(16) Access to mental health care is limited as well, with only 56% of psychiatrists accepting commercial insurance compared to 90% of other, non-mental health physicians.(17) Given these costs and limited resources, we think people who are housing insecure face a stumbling block in seeking mental health care and thus have relatively few claims per month related to mental, behavioral and neurodevelopmental disorders in the past year. We cannot simply trust what the data tells us and should relate it to the real world in order to better understand the data set.

5.3.2 Non-behavioral Diseases

This is a little strange, because in this set of data participants with housing insecurity seem healthier than other participants: they go to outpatient facilities and physician offices less frequently. So what is the reason for this result? As in the previous analysis of behavioral diseases, we believe limited resources and unaffordable costs are still the main reasons, which keep people with housing insecurity from being diagnosed in time and treated effectively when they have potential non-behavioral diseases. According to a previous study, the average cost of a single outpatient visit in the US is $500, compared to an average of $76 in Japan, $157 in Canada and $170 in France.(18) This set of data comes from 2018. Since then, the COVID-19 pandemic has hit the entire U.S. healthcare system. In its early and middle stages, many hospitals were understaffed and medical equipment was in short supply, which made it harder for everyone to see a doctor and may have pushed prices up. From our previous analysis of income and loans, it is not difficult to see that participants at risk of housing insecurity are generally more likely to face financial crises. For them, high medical and treatment costs are sometimes unaffordable. This explains why participants who have not seen a doctor for a long time have a higher probability of being housing insecure.

5.4 Other Factors

In the data, few participants entered Medicare because of end stage renal disease or because of both reasons. We mainly compare the proportions of those who entered for age reasons and for physical disability reasons. Among people without housing insecurity, old age survivors insurance accounts for 73.7% and disability for only 26.2%. Among people with housing insecurity, old age survivors insurance accounts for 56.8% and disability for 43.0%.
According to this set of data, a higher proportion of housing-insecure people than of other people entered because of disability rather than old age, so people with disabilities are over-represented in the housing-insecure group. This means that many disabled people face housing insecurity, which agrees with a previous study. SSI is a federal program that provides cash assistance to people with significant, long-term disabilities and less than $2,000 in wealth. Approximately 4.8 million adults with disabilities between the ages of 18 and 64 receive SSI income. In most cases, SSI is the only source of income they have to meet their expenses. The report shows that SSI payments are too low for recipients to afford their housing and other necessities without other housing assistance, like a Housing Choice Voucher.(20)
We believe that the initial reason for entering Medicare should be analyzed together with participants’ age. Among the population that is not housing insecure, most participants entered Medicare because of age, which corresponds to more than 90% of people being over the age of 60. For reference, the average retirement age in the United States is between 60 and 64.(21) Among the remaining people under the age of 60, there is a relatively high proportion of housing-insecure people. If they did not enter because of age, they most likely entered Medicare because of disability. In other words, we believe the high housing insecurity rate among 40-50-year-old participants arises because a high proportion of them are disabled. This is a selection bias in the database: people under retirement age have joined Medicare because of disability, and disabled people have a higher rate of housing insecurity, which leads to a higher probability of 40-50-year-olds being housing insecure.
So does this mean that age has no effect at all on housing insecurity? According to a 2021 study, many adults have been affected by housing insecurity since the outbreak of COVID-19. Among them, the 65-79 age group showed the highest share behind on housing payments in the survey.(22) Therefore, different age groups do show different resilience to housing insecurity. With less biased data, we could analyze the data set again to determine the impact of age on housing insecurity.

6. Suggestions and Solutions

We must first recognize that housing insecurity is a problem that cannot be completely solved, but we must try our best to alleviate it. Housing insecurity is a very complex healthcare, economic and social issue. The circumstances that cause it vary from case to case. At the same time, external influences, such as economic crises and pandemics, change how difficult it is to address in different periods. But we believe the root causes of housing insecurity can still be summarized. After reviewing many existing studies, we think the factors that make people housing insecure mainly relate to the following points: cost burden, residential instability, neighborhood quality and overcrowding. Our solutions therefore set out to address these points. We also supplement them with additional information obtained from this data analysis, propose two solutions suited to current housing insecurity, and analyze their feasibility.
First of all, we need to sort out which players are involved in the problem of housing insecurity. According to 2021 survey data, about 3.7 million people in the United States face housing insecurity, and about 7.7 million adults are behind on rent.(4) From the data set provided by Humana, we believe that housing-insecure people may share some of the following characteristics: income and personal property significantly below average, limited access to healthcare resources, poor physical and mental health, and possible disability. We integrate these factors into the solutions. We think two types of problem solvers can effectively help alleviate this problem. One is government and policy makers, such as federal and state governments and government departments associated with healthcare. The other is large healthcare organizations, including insurance companies like Humana and some non-profit charitable organizations.
For the government, the main tool is fiscal policy. Put simply, the government can address housing insecurity by adjusting taxes for different groups of people, building more public facilities, improving public housing, and adding loan programs with special conditions. Large healthcare organizations can act as a bridge between the government and the people, identifying people with housing insecurity in a timely manner and helping them obtain appropriate help quickly.
Let’s analyze it step by step.
The most direct problem faced by housing-insecure people is cost burden. Cost burden means that the ratio of housing cost to income exceeds 30%.(1) This means that housing-insecure people have relatively low income, and after paying for housing they have little budget left for everyday life. One of the first things the government can do is make tax adjustments. Existing tax policies already provide certain subsidies to low-income groups, such as some income tax exemptions.(23) We do not think further adjustments here would be much more effective.
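The cost burden definition reduces to a simple ratio, sketched below. The 30% and 50% thresholds are the ones quoted from the housing data earlier in this report. The label "severely cost burdened" for the 50% tier and the function name are assumptions made for illustration.

```python
def cost_burden(annual_housing_cost: float, annual_income: float) -> str:
    """Classify a household by the share of income spent on housing."""
    if annual_income <= 0:
        raise ValueError("annual_income must be positive")
    share = annual_housing_cost / annual_income
    if share >= 0.5:
        return "severely cost burdened"  # assumed label for the 50% tier
    if share >= 0.3:
        return "cost burdened"
    return "not cost burdened"


# A household earning $15,000 a year at three levels of housing cost.
low = cost_burden(3_000, 15_000)     # 20% of income
medium = cost_burden(6_000, 15_000)  # 40% of income
high = cost_burden(9_000, 15_000)    # 60% of income
```

At an income of $15,000, a housing cost of only $375 per month is already enough to cross the 30% line, which illustrates how little room low-income households have.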
Similar to the loan system, we believe that the current government is also trying its best to revise and improve loan-related policies to support low-income people to better borrow the money they need to meet their needs. We cannot expect the government and banks to unconditionally help low-income people through loans, which can only solve the problem of house insecurity in the short term. So what can the government do to further improve it?
We believe that the government should strengthen the improvement of public facilities and public housing.
First, we believe that building public facilities is the most direct measure a government can take. Within its policy constraints, the government can adjust public spending and direct more of it toward public facilities, which directly addresses neighborhood quality. For people in housing insecurity, a major problem is the low quality of life and of the surrounding environment: poor security, few nearby medical facilities, inconvenient transportation and generally poor surroundings. All of these depend on public facilities. The government should pay more attention to local infrastructure and essential facilities in areas where housing insecurity is more likely. More police patrols and more emergency alarm buttons can help reduce local crime and give residents a stronger sense of security.
Attention to the density of medical services in a region, and more public funding for public hospitals, would give people in housing insecurity access to more medical resources. Our data analysis shows that people in housing insecurity are more likely than other groups to be frail or disabled, and that they generally have limited medical resources. Building more public hospitals, adding medical staff and upgrading hospital facilities would help give this population a higher-quality neighborhood. Another policy that may improve medical care is to have hospitals in different regions share experience more actively. Well-known hospitals with stronger capabilities could be paired one-to-one with smaller hospitals whose care is less developed, so that more efficient administration systems, treatment processes and clinical experience reach more hospitals.
Improvements in public transportation can also help people in housing insecurity. Today, getting around rarely seems difficult: whether the trip is 5 miles or 500 miles, it does not seem out of reach. Indeed, in 2019, 93% of U.S. households had access to at least one car, and the average American household owns at least one car (25). On the other hand, according to 2021 data, people with incomes under $50,000 account for less than 15% of all car owners in the United States, and people with incomes under $30,000 for less than 5% (24). Our data analysis shows that low-income groups overlap closely with the housing insecure population. We therefore speculate that many people in housing insecurity face the difficulty of getting around without a car. That makes it harder for them to buy food and necessities, to get medicine and to seek medical care in time. It may also restrict their job search, because openings in places that are hard to reach without a car are difficult for them to take. All of this lowers their quality of life and can further reduce their income, which in turn increases their cost burden. If they have to move in order to get around more easily, that contributes to residential instability. To address this, we believe the government should allocate more budget to public transportation. We suggest that the government work with universities or professional institutions to build operations research models of local residents' travel habits and the places they frequent, and use them to design a better public transportation system. Such measures improve the efficiency of public transportation, instead of blindly adding bus routes or digging more subway lines.
Such improvements can also better protect the environment.
With respect to the environment, the government should also restore green space and build public sports and recreational facilities in areas with more housing insecurity. Our data show that people in housing insecurity generally have relatively limited medical resources to devote to mental health. We believe that improving neighborhood quality can help them achieve better mental health. Existing studies link environmental pollution to many mental health problems, including depression, dementia, anxiety and suicide (26). We encourage the government to expand green space and build parks and public fitness facilities so that people in housing insecurity have better external living conditions, which can improve their quality of life and encourage better physical and mental health.
Beyond public facilities, the government also needs to pay attention to public housing. The existing U.S. public housing system is undeniably well established. The United States Department of Housing and Urban Development (HUD) runs a public housing program that provides affordable rents for low-income people, the elderly, people with disabilities and some eligible families (27). Currently, 970,000 households in the United States live in public housing units. This number needs to grow in order to help more people in housing insecurity. In UK and French urban areas, more than 25% of households receive national low-income housing assistance, while the figure for US urban areas is only 10% (28). Some research also points out that affordable housing subsidies are inequitably distributed across US urban areas (28). The government therefore needs to invest more in building public housing, because providing affordable housing is the most direct help for people in housing insecurity.
Building more public housing alone still cannot solve the problem directly; we also need to consider how these new resources can reach people in housing insecurity. Here we think the help of a large healthcare organization like Humana is needed. Large healthcare organizations have a wealth of medical data, rich experience in improving living conditions, and experience working with governments. They can act as a bridge between the housing insecure population and the government, finding the groups in need and delivering resources to them efficiently.
A government that wants to help people in housing insecurity has difficulty identifying exactly who is eligible, which is why applying for subsidies or public housing involves a complicated process. People in housing insecurity, in turn, have difficulty finding the resources they need quickly, including housing and medical resources. Healthcare organizations are well placed to make this connection. They resemble the real estate agents who connect properties with buyers, or the investment banks that bring good projects to investors. With them, the response to housing insecurity forms a closed loop.
We propose two things that large healthcare organizations can do. The first is to identify people facing housing insecurity effectively; the second is to build a network platform that connects people in need with the resources they need.
We believe that, compared with the government, large healthcare organizations are better positioned to identify the housing insecure population from data. In this analysis, the existing data and algorithms already allowed us to identify housing insecure members in the sample with high probability. With more data and more time to refine the model, we are confident that we can identify this population with higher accuracy. From the multi-dimensional data set provided by Humana, we can extract information from medical claims, personal finance and a large amount of seemingly unrelated but actually relevant data, and build a predictive model that can be continuously improved. Such models would improve further if different healthcare organizations cooperated with each other and with government databases. The current data is still biased and the sample size is not large enough. With a better data set in the future, healthcare organizations could screen for housing insecurity more efficiently and accurately than they can now. This can save the government a lot of time. Sometimes the government has the budget to address this social problem but finds it hard to start because it does not know who is really in need. With these predictions, policies that address housing insecurity can be advanced more efficiently.
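As an illustration of the screening step, the sketch below flags members whose predicted probability of housing insecurity meets a cutoff and lists them from highest to lowest risk. It assumes a trained model has already produced one probability per member; the member IDs, the scores and the 0.6 cutoff are hypothetical and are not results from our model.

```python
from typing import Dict, List


def flag_members(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return IDs of members whose predicted probability meets the threshold,
    ordered from highest to lowest predicted risk."""
    flagged = [(member_id, p) for member_id, p in scores.items() if p >= threshold]
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return [member_id for member_id, _ in flagged]


# Hypothetical predicted probabilities from a fitted classifier.
scores = {
    "member_a": 0.82,
    "member_b": 0.35,
    "member_c": 0.61,
    "member_d": 0.10,
}

# Hypothetical cutoff; in practice it would be tuned to balance
# missed cases against unnecessary outreach.
flagged = flag_members(scores, threshold=0.6)
```

A lower cutoff reaches more members in need at the cost of more false alarms, so the choice depends on how much outreach capacity is available.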
The second measure is to build an information platform that collects and publishes the resources available for housing insecurity and helps people apply for what they need more quickly. The United States already has many subsidies and assistance policies for people in housing insecurity, including HUD’s public housing and local subsidy vouchers. In many cases, however, people in housing insecurity have limited means of gathering information, so they cannot reach what they need in time. Besides the government, many public organizations and large enterprises are also helping as much as they can. For example, Humana invested 25 million dollars this year. To integrate these resources more effectively, we propose a website that lists and categorizes all relevant policies and assistance programs. After people in housing insecurity fill out their personal information once, they no longer need to reapply separately for each program, and they can see all applicable policies and local assistance in one place. By optimizing over this data, the website can recommend the most suitable and closest option, which also saves the time of searching and comparing. Such a website can also be offered as a mobile application, so that people in housing insecurity can check the information anytime and anywhere and start an application directly on their phones. When a government agency or institution receives an application, it can also run preliminary screening and verification through data analysis, which saves the time and cost of manual processing and allows help to arrive as quickly as possible. Speed matters greatly here, because every moment without adequate housing is a hardship.
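The recommendation step of such a platform could be sketched as follows: filter programs by an eligibility rule, then rank the remaining ones by distance. This is a minimal sketch under strong assumptions. The `Program` fields, the income-only eligibility rule and the example programs are all hypothetical; real eligibility criteria are considerably more detailed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Program:
    name: str
    max_annual_income: float  # hypothetical eligibility rule: income ceiling
    distance_miles: float     # hypothetical distance from the applicant


def recommend(programs: List[Program], annual_income: float,
              top_n: int = 3) -> List[Program]:
    """Keep programs the applicant qualifies for, closest first."""
    eligible = [p for p in programs if annual_income <= p.max_annual_income]
    return sorted(eligible, key=lambda p: p.distance_miles)[:top_n]


# Hypothetical catalogue of assistance programs.
programs = [
    Program("Public housing unit", 30000, 12.0),
    Program("Local rent voucher", 25000, 3.5),
    Program("Utility assistance", 40000, 8.0),
]

# Hypothetical applicant with an annual income of $28,000.
matches = recommend(programs, annual_income=28000)
match_names = [p.name for p in matches]
```

Here the applicant exceeds the voucher's income ceiling, so only the other two programs are returned, with the closer one listed first.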

7. Conclusion and Further Study

In this study, we explored the data set from Humana in depth to analyze the social problem of housing insecurity. We applied several models to the existing data, including logistic regression, random forest and XGBoost. After tuning hyperparameters, we chose XGBoost as our final model to predict housing insecurity among participants. We also ranked the top 20 features that contribute most to the prediction of housing insecurity in our model. Combining these with previous studies, we find that people in housing insecurity share several factors: relatively low income and limited personal property, frailty with poor physical and mental health, limited medical resources and possible disabilities. With these circumstances in mind, we make recommendations that draw on existing research on the housing insecure population.

We believe that the government should invest more in public facilities, including hospitals and public transportation systems. The government should also build more public housing, which most directly meets the needs of people in housing insecurity. In addition, large healthcare institutions can play a vital role as a bridge between people in housing insecurity and government-provided resources. We hope that these institutions can fully integrate and use their existing data to identify housing insecure people more efficiently, and then build a network platform that makes it convenient to search for resources and apply for help. We do not think housing insecurity can be completely solved, but we think these measures can help build a better world and help more people live better lives than they do today.

In the future, we believe the following points deserve further study to improve our model and better predict the housing insecure population. We hope to understand the important features in our model more fully, for example what the short-term loan index and the student loan index each mean. We also believe that the data currently used is not only imbalanced but also biased in several ways by its sources. With more databases to analyze, we are confident that we can build better predictive models.
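One common way to account for imbalanced labels is to weight the minority class by the ratio of negative to positive examples. The sketch below computes that ratio; it is an illustrative assumption and not a description of how our model was trained. The label counts are hypothetical. XGBoost exposes a `scale_pos_weight` parameter for which this ratio is a commonly suggested starting value.

```python
from collections import Counter
from typing import Sequence


def positive_class_weight(labels: Sequence[int]) -> float:
    """Return the ratio of negative (0) to positive (1) examples."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("labels contain no positive examples")
    return counts[0] / counts[1]


# Hypothetical labels: 95 members without and 5 with housing insecurity.
labels = [0] * 95 + [1] * 5
weight = positive_class_weight(labels)
```

With these hypothetical counts, each positive example would be weighted 19 times as heavily as a negative one.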

We also believe that further communication with relevant government departments would help us better understand the current policies and operations aimed at housing insecurity. Time permitting, we think that only personal experience, such as going through an application for public housing, can show how difficult these resources are to reach. Finally, we suggest that relevant agencies conduct a large-scale survey to learn the biggest difficulties in the lives of people who have become housing insecure, so that help can be directed most efficiently to the most pressing problems.

‘Housing is absolutely essential to human flourishing. Without stable shelter, it all falls apart.’(29)

We sincerely hope that fewer and fewer people will suffer from housing insecurity.

8. References

https://health.gov/healthypeople/priority-areas/social-determinants-health/literature-summaries/housing-instability
https://en.wikipedia.org/wiki/Housing_insecurity_in_the_United_States
https://www.huduser.gov/portal/pdredge/pdr-edge-frm-asst-sec-111918.html
https://endhomelessness.org/resource/housing-insecurity-rent-relief-eviction-assistance/
https://cepr.net/report/housing-insecurity-by-race-and-place-during-the-pandemic/
https://www.macrotrends.net/countries/USA/united-states/gdp-per-capita
https://www.fool.com/the-ascent/research/average-house-price-state/
https://press.humana.com/news/news-details/2022/Humana-Expands-National-Commitment-to-Affordable-Housing-With-Additional-25-Million-Investment/#gsc.tab=0
https://www.cms.gov/Medicare/Eligibility-and-Enrollment/LowIncSubMedicarePresCov/Downloads/StateLISGuidance021009.pdf
https://www.healthcare.gov/glossary/federal-poverty-level-fpl/
https://worldpopulationreview.com/state-rankings/average-family-income
https://pubmed.ncbi.nlm.nih.gov/15031310/
https://www.hopkinsmedicine.org/health/wellness-and-prevention/stay-strong-four-ways-to-beat-the-frailty-risk
https://link.springer.com/article/10.1007/s11524-022-00619-5
https://www.goodtherapy.org/blog/faq/how-much-does-therapy-cost
https://onemindatwork.org/at-work/the-business-case/
https://www.cnbc.com/2021/05/10/cost-and-accessibility-of-mental-health-care-in-america.html
https://www.benefitspro.com/2018/12/27/the-cost-of-an-outpatient-visit-500/?slreturn=20220915105653
https://www.ssa.gov/oact/progdata/describeoasi.html
https://nlihc.org/resource/people-disabilities-face-significant-affordability-challenges-rental-market
https://crr.bc.edu/briefs/what-is-the-average-retirement-age/
https://www.jchs.harvard.edu/blog/older-renters-color-have-experienced-high-rates-housing-insecurity-during-pandemic
https://www.taxpolicycenter.org/briefing-book/how-does-federal-tax-system-affect-low-income-households
https://www.statista.com/statistics/1041177/us-car-owners-by-income-group/
https://www.thezebra.com/resources/research/car-ownership-statistics/
https://www.unep.org/news-and-stories/story/caring-environment-helps-care-your-mental-health
https://www.hud.gov/topics/rental_assistance/phprog
https://www.urban.org/urban-wire/lessons-overseas-could-improve-uss-affordability-crisis
https://twitter.com/housing_hope/status/1467912416092635145