
Project - Phase 3

Guha Mahesh
Bhuvan Hospet
Sota Shimizu
Carter Vargas

Phase 3
#

Changes from Phase 2:
#

  • After speaking with our professors, we refined the concepts for our three user personas:
    • We switched the Student persona to an Economist
    • We cemented the interactions between the different personas
    • We fully committed to our app being about economic policy specifically, as opposed to all policy
  • We added our wireframes, which were missing from our previous blog post
  • We corrected the features and targets of the second model to reflect fiscal policy, such as government spending; the features are the percentages of GDP that the government spent on various sectors

Features for the Machine Learning Models
#

Model 1: GDP Per Capita (Linear Regression)
#

GDP (Gross Domestic Product) per capita is a metric that measures a country’s economic output per person. The features for this model included:

  • Current health spending expressed as % of previous year’s GDP
  • Current education spending expressed as % of previous year’s GDP
  • Previous year’s military spending (% of government expenditure)
  • The previous year (captures time-based economic trends)
  • Country

This model predicts this year’s GDP per Capita.

Example:

  • Health spending: 8% (current spending as % of previous year’s GDP)
  • Education spending: 5% (current spending as % of previous year’s GDP)
  • Military spending: 3% (previous year’s spending)
  • Country: Germany
  • Previous year: 2023

In this case, the model would predict Germany’s 2024 GDP per capita based on these lagged spending patterns.

We researched many candidate features (mainly through the World Bank) and found that most of the available GDP-related data fell under these 3 spending categories. When plotting the data, we found a definite linear trend; however, country had a large impact on the resulting GDP values.
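One way to let a linear model absorb the large per-country differences is to one-hot encode the country, so that each country gets its own intercept while sharing a slope. The following is a minimal sketch under that assumption, using NumPy least squares on made-up toy numbers; the function names and the data are hypothetical and are not taken from the project code.

```python
import numpy as np

def one_hot(labels):
    """Return the sorted category names and a 0/1 indicator matrix."""
    categories = sorted(set(labels))
    matrix = np.array(
        [[1.0 if label == c else 0.0 for c in categories] for label in labels]
    )
    return categories, matrix

def fit_with_country(x, countries, y):
    """Least-squares fit with one shared slope and one intercept per country."""
    categories, dummies = one_hot(countries)
    # No separate intercept column: the country indicators already sum to 1.
    design = np.column_stack([np.asarray(x, dtype=float), dummies])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef[0], dict(zip(categories, coef[1:]))

# Toy data (invented): same slope, very different baseline per country.
x = [1, 2, 3, 1, 2, 3]
countries = ["A", "A", "A", "B", "B", "B"]
y = [12, 14, 16, 52, 54, 56]

slope, intercepts = fit_with_country(x, countries, y)
```

In this toy setup a single pooled line would fit poorly, while the per-country intercepts recover the shared slope.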

Model 2: S&P Index and Currency Exchange Rates (Linear Regression)
#

The features for this model included:

  • Federal Discount Rate for the United States
  • The Federal Balance Sheet Size (Billions $)
  • Treasury Securities Holdings (Billions $)
  • The Lagged Months: 1mo, 2mos, 3mos, 6mos, 9mos

We experimented with various monetary policy features in addition to net export data. However, the net export data didn’t fit the theme of monetary and fiscal policy. Additionally, the federal funds rate appeared to be less correlated with the targets.
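The lagged-month features listed above could be built roughly as follows. This is a minimal sketch that assumes the lags are applied to a single monthly series ordered oldest to newest; the function name, the column names, and the placeholder series are hypothetical.

```python
def add_lag_features(values, lags=(1, 2, 3, 6, 9)):
    """Return one row per month that has a full lag history.

    `values` is a monthly series ordered oldest to newest.
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(values)):
        row = {"value": values[t]}
        for lag in lags:
            # The value observed `lag` months before month t.
            row[f"lag_{lag}mo"] = values[t - lag]
        rows.append(row)
    return rows

# Placeholder series (not real data): 12 consecutive months.
monthly_series = [float(i) for i in range(12)]
rows = add_lag_features(monthly_series)
```

The first 9 months are dropped because they lack a complete 9-month history, which is worth keeping in mind when the dataset is short.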

Model 2 Plots
#

QQ Plots:
#

image
Before testing other assumptions, we decided to check for normality in the residuals. As you can see in the six residual QQ plots, the points generally lie close to a straight line, with minimal variability around the line. This suggests that the residuals are approximately normally distributed.
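For readers who want to see what a QQ plot actually compares, here is a minimal sketch using only the Python standard library. It pairs sorted, standardized residuals with theoretical normal quantiles and reports how close the points are to a straight line. The plotting positions `(i + 0.5) / n` are one common convention chosen here as an assumption, and both sample series are synthetic.

```python
from statistics import NormalDist, mean, pstdev

def qq_points(residuals):
    """Pair each theoretical normal quantile with a sorted, standardized residual."""
    n = len(residuals)
    mu, sigma = mean(residuals), pstdev(residuals)
    ordered = sorted((r - mu) / sigma for r in residuals)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

def qq_correlation(points):
    """Pearson correlation of the QQ points; values near 1 mean a near-straight line."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in points)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Synthetic examples: one normal-shaped sample and one heavily skewed sample.
normal_like = [NormalDist(0, 2).inv_cdf((i + 0.5) / 50) for i in range(50)]
skewed = [float(i) ** 3 for i in range(50)]

r_normal = qq_correlation(qq_points(normal_like))
r_skewed = qq_correlation(qq_points(skewed))
```

The skewed sample bends away from the line, so its correlation is lower than that of the normal-shaped sample.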

Assumption Plots
#

SP500 Before
#

image

SP500 After
#

image
As you can see in the first plot, we had very clear patterns in our residual plots and an extremely skewed residual histogram. This is not ideal as it means that our model doesn’t meet any of our standards:

  • There is no linearity, as the residuals are not evenly distributed around 0
  • There is no homoscedasticity, as the residuals are clustered rather than showing a constant spread
  • There is also heavy autocorrelation, as the top right graph clearly moves in a pattern with time

However, after we added 6 different time lag features, the second plot showed much more promising trends. It is important to acknowledge that there are still very clear patterns, but there is a clear improvement over the original. The residuals are much more concentrated around 0 and there is less of a pattern with date/time. Additionally, the plot shows somewhat more homoscedasticity.
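The autocorrelation visible in a residual-versus-time plot can also be summarized with a single number. The sketch below uses the Durbin-Watson statistic, a standard check that is swapped in here for illustration; the post does not state that it was part of the project. Values near 2 suggest little first-order autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation. The residual series are made up.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals ordered by time."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Made-up residuals that drift steadily with time (positive autocorrelation).
trending = [-3, -2, -1, 0, 1, 2, 3]
# Made-up residuals that flip sign every step (negative autocorrelation).
alternating = [1, -1, 1, -1, 1, -1]

dw_trending = durbin_watson(trending)
dw_alternating = durbin_watson(alternating)
```

Comparing this statistic before and after adding lag features would give a numeric counterpart to the visual improvement described above.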

For the model which uses these same features to predict currency exchanges, there would be too many plots to show effectively, so we aggregated them and showed the averages.

Currencies Before
#

image

Currencies After
#

image
As you can see, the same 6 lag strategy worked very effectively on the currency data as well. In the first plot, the histogram shows a large amount of spread, in addition to clear heteroscedasticity and autocorrelation. The second plot, with the six lags, appears to mostly mitigate these issues.

GDP Per Capita Model
#

To start, we plotted each feature (as a % of the previous year’s GDP) against the following year’s GDP. This involved using the current year’s GDP to calculate the current total spending, then expressing that spending as a percentage of the previous year’s GDP.
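The conversion described here can be sketched in a few lines. This assumes the source data reports spending as a percentage of the same year’s GDP, so the absolute spending is recovered first and then divided by the previous year’s GDP. The function name and the numbers are hypothetical.

```python
def spending_as_pct_of_previous_gdp(pct_of_current_gdp, gdp_by_year):
    """Re-express spending (% of same-year GDP) as % of the previous year's GDP.

    Both arguments are dicts keyed by year. Years without a previous-year
    GDP value are skipped because the lagged ratio is undefined for them.
    """
    result = {}
    for year, pct in pct_of_current_gdp.items():
        current_gdp = gdp_by_year.get(year)
        previous_gdp = gdp_by_year.get(year - 1)
        if current_gdp is None or previous_gdp is None:
            continue
        spending = pct / 100 * current_gdp          # absolute spending this year
        result[year] = spending / previous_gdp * 100
    return result

# Invented numbers: GDP grows 10%, spending stays at 8% of current GDP.
gdp = {2022: 1000.0, 2023: 1100.0}
health_pct = {2022: 8.0, 2023: 8.0}

lagged = spending_as_pct_of_previous_gdp(health_pct, gdp)
```

With these toy inputs, 8% of a 1100 GDP is 88, which is 8.8% of the previous year’s 1000 GDP; 2022 is dropped because no 2021 GDP is available.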

image
image
image

These plots showed a generally linear correlation, with the exception of Russia, which showed a clear opposite trend, so we decided to remove Russia from our dataset. These graphs also revealed that time and country were major factors that we should definitely account for in our model.

image
image
image

We also plotted a line graph of each feature over time to check for a time trend. We found enough correlation to justify accounting for time in the model.

image
image
image
image
image
image

After training our linear regression model, we produced residual plots for each feature (residuals vs. x values and residuals vs. order), which also took country into account. For all 3 features, the residual plots show a linear relationship, with roughly equal numbers of data points above and below the mean. The index (order) residual plots for all 3 features aren’t entirely random, showing slight curves and outliers, but this is likely because countries have very different situations and these features alone can’t fully predict something as complicated as GDP. The model meets the linearity, no-autocorrelation, and homoscedasticity assumptions for the most part; however, it doesn’t fully meet the no-multicollinearity assumption. The features all have similar trends, likely because some countries chose to increase government spending overall, which raises all 3 features together rather than individually. This is especially true for health and education spending, which have similar trends.
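A quick way to quantify the multicollinearity concern is to look at pairwise correlations between the features. The sketch below is a minimal standard-library version; the feature names mirror the ones in this post, but the values are invented purely for illustration and say nothing about the real dataset.

```python
from itertools import combinations
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def pairwise_correlations(features):
    """Map each pair of feature names to the correlation of their values."""
    return {
        (a, b): pearson(features[a], features[b])
        for a, b in combinations(features, 2)
    }

# Invented values: health and education move together, military does not.
features = {
    "health": [5.0, 6.0, 7.0, 8.0],
    "education": [4.0, 5.0, 6.0, 7.0],
    "military": [3.0, 1.0, 4.0, 2.0],
}

correlations = pairwise_correlations(features)
```

Pairs with an absolute correlation close to 1 are the ones most likely to violate the no-multicollinearity assumption.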

Our Integrated ML Model
#

Note: these are three stitched screenshots, not one page
#

image
Here, you can see us selecting our features for both of our ML models (one predicting the S&P 500 and exchange rates with monetary policy, and the other predicting GDP per capita with fiscal policy). The frontend has quite a bit of room to become more visually pleasing, but the pages are functional as of now.

Flask API Routes
#

Our team has implemented a variety of REST API routes that are used across our application. As seen in the table, we have incorporated GET, POST, PUT, and DELETE routes that provide many functionalities to our app.

image

“GET” routes were especially useful for filtering data from the database. Using MySQL queries, we were able to use SELECT statements with conditions to display specific policies from the database. Users are then able to favorite policies for further research on a different page.
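As an illustration of how a filtering GET route might turn query parameters into a conditional SELECT, here is a minimal sketch. The table name `Policies`, the filter columns, and the function name are all hypothetical and not taken from the project. The `%s` placeholders follow the parameter style used by common Python MySQL drivers, and the allow-list keeps user input out of the SQL text itself.

```python
# Hypothetical column names that a client is allowed to filter on.
ALLOWED_FILTERS = {"country", "topic", "year"}

def build_policy_query(filters):
    """Build a parameterized SELECT from a dict of column -> value filters."""
    clauses, params = [], []
    for column, value in filters.items():
        if column not in ALLOWED_FILTERS:
            # Column names cannot be parameterized, so reject unknown ones.
            raise ValueError(f"unsupported filter: {column}")
        clauses.append(f"{column} = %s")
        params.append(value)
    sql = "SELECT * FROM Policies"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, tuple(params)

sql, params = build_policy_query({"country": "Germany", "year": 2023})
```

A route handler could pass `sql` and `params` to the driver’s `cursor.execute(sql, params)` call, leaving value escaping to the driver.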

image

The “POST” and “DELETE” routes can also be used here to modify which policies are added to and removed from the Favorite_Policies table. This example uses a foreign key and a JOIN command to link the two tables.
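The foreign key and JOIN relationship can be sketched with an in-memory database. The example below uses Python’s built-in `sqlite3` as a stand-in for MySQL so that it runs anywhere; apart from `Favorite_Policies`, every table, column, and function name is an assumption made for illustration.

```python
import sqlite3

def make_demo_db():
    """Create an in-memory database with a policies table and a favorites table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute(
        "CREATE TABLE Policies (policy_id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE Favorite_Policies ("
        " user_id INTEGER NOT NULL,"
        " policy_id INTEGER NOT NULL REFERENCES Policies(policy_id),"
        " PRIMARY KEY (user_id, policy_id))"
    )
    conn.executemany(
        "INSERT INTO Policies VALUES (?, ?)",
        [(1, "Placeholder policy A"), (2, "Placeholder policy B")],
    )
    return conn

def add_favorite(conn, user_id, policy_id):
    """What a POST route might run."""
    conn.execute(
        "INSERT INTO Favorite_Policies (user_id, policy_id) VALUES (?, ?)",
        (user_id, policy_id),
    )

def remove_favorite(conn, user_id, policy_id):
    """What a DELETE route might run."""
    conn.execute(
        "DELETE FROM Favorite_Policies WHERE user_id = ? AND policy_id = ?",
        (user_id, policy_id),
    )

def list_favorites(conn, user_id):
    """JOIN favorites to policies so only this user's saved titles come back."""
    rows = conn.execute(
        "SELECT p.title FROM Favorite_Policies f"
        " JOIN Policies p ON p.policy_id = f.policy_id"
        " WHERE f.user_id = ? ORDER BY p.policy_id",
        (user_id,),
    ).fetchall()
    return [title for (title,) in rows]
```

Keying the favorites table on both the user and the policy means each user sees only their own saved policies, and the foreign key rejects favorites that point at a policy that does not exist.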

image

Updated ER Diagram
#

We greatly improved the structure of our ER (Entity Relationship) diagram after receiving feedback on the previous phase. Here is what we currently have:

image

The database now accounts for individual users, which allowed us to do much more with the API to make the website user-specific and to create a realistic app setting (e.g., saved/favorited features are only shown to the user who saved/favorited them).

Streamlit
#

So far, our team has put together three working personas with mostly complete functionality. Each persona has been updated to include at least two interactive pages. The frontend uses all 4 types of API routes so the user can directly interact with the database through the web app.

Here are some examples of pages that were not already shown above:

image
Screen for the Lobbyist, which allows them to write down a note on a conversation they had with a selected Politician.

image
Another Lobbyist screen, which allows them to view multiple conversations they had. However, it still needs to be updated to include the model.

image
The current results screen for the Policy Maker. We will be implementing more visuals soon.