import pandas as pd

//anaconda/lib/python3.6/site-packages/ipykernel/__main__.py:26: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

End of dataframe reached

//anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:82: RuntimeWarning: Degrees of freedom <= 0 for slice
  warnings.warn("Degrees of freedom <= 0 for slice", RuntimeWarning)
//anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:94: RuntimeWarning: invalid value encountered in true_divide
  arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
//anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:116: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
//anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
//anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:70: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)

count    722.000000
mean       0.918040
std        0.185344
min       -1.515708
25%        0.924215
50%        0.958641
75%        0.975608
max        1.041034
Name: lower_conf_interval, dtype: float64

(1087, 70)

//anaconda/lib/python3.6/site-packages/ipykernel/__main__.py:1: FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....)
  if __name__ == '__main__':

24.3836246550138

//anaconda/lib/python3.6/site-packages/ipykernel/__main__.py:1: FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....)
  if __name__ == '__main__':

<matplotlib.collections.PathCollection at 0x11fa3aa90>

[<matplotlib.lines.Line2D at 0x12326a710>]

Millenials and Investment: an Ongoing Exploration

In case you haven’t heard, millenials are killing everything from diamonds to department stores to designer crap to grocery chains.

Why? Sure, the recession had an impact. But also, millenials pay more attention to ethics than many multinational corporations bargain for. They cite the blood diamond trade as a major reason to spring for non-traditional engagement rings. They opt for grocery providers that can tell them where their food is coming from and under what conditions it was produced. They’re ditching the fast fashion industry for higher-priced items purchased secondhand on sites like Poshmark and ThreadUp.

For millenials, investing and values go hand in hand

Source for image

And as millenials reach the age where they might accrue some savings, it makes sense that they would care about where that is going, too. In addition to millenial attendance at the NoDAPL protests, we saw thousands of millenials divest from Western Union, Bank of America, and other banks that loaned money to the project. Maybe megacorps won’t change their tunes because a few thousand people stood in a field to get mowed down by water cannons, but they’re more likely to sit up and listen when those same people take their hard-earned doll hairs to another playhouse.

So we see that millenials are surveying their options to spend and save according to their values. What about investing? Any personal finance 101 that isn’t taught by a financial advisor will recommend a low cost index as the place to stick extra money so it can grow with the market. Most index funds, including the most recommended one (Vanguard), decide their investments via index-matching: matching their holdings to the S&P500 by market cap, with no other variables. Thing is, plenty of investors are expressing interest in taking ethical considerations into account. Some portfolios do this by blanket blocking investments in certain industries like tobacco or porn. Other more advanced optsions, like Betterment’s AutoSRI portfolio, use actual ESG data to determine where they invest the money. There isn’t (yet) a fully customizable option to allow folks to automatically invest their funds based on a checklist of their individual values. For a while, I’ve thought about building a toy version of what that might look like.

When I talk about the idea with friends and relatives, I get the following objection: ‘What about the returns?’ Touche. Nobody wants to lose out on their potential earnings. At first, I figured I’d build a tolerance into the system that allowed investors to say ‘These are my values, but please don’t invest in a way that will trail general market performance by more than x percent.’ The algorithm would then predict stock performance for each company, somehow blend that with ESG rating, and come up with a combined weight for divvying up investment money.

Before I build that, though, I need to test the assumption that high ESG ratings do correlate negatively with returns. If they don’t, there’s no need for the tolerance measure in the first place.

I’m not the first person to run correlations along these lines. Dorfleitner, Utz, and Wimmer published a paper on this just last year. Their analysis suggests that higher corporate social responsibility ratings increase returns over a long period of time (“long” being a 12 year period from 2002-2014). They even identify three specific areas that correlate with higher than average returns: emission and resource reduction, workforce, and society. So in my exploration, I’ll dig into some specific CSR breakdowns with the data I have on S&P 500 companies.

import pandas as pd

Correlating KLD ESG Ratings to Stock Performance, 1990-2005

Let’s determine whether we notice any correlation between companies’ environmental, social, and governmental ratings and their stock performance.

First, we pull in the ESG data.

These come from KLD and are now distributed by MSGI. I pulled them from an academic database. Don’t rerun this notebook because I didn’t push the actual data to Github, on account of it is large and on account of both data providers ask corporations to pay for the data. So I’m not going to undermine that.

y91 = pd.read_excel('../stockproject/12231046.1990-1999.stats/1991 HistoricalSpreadsheet_STATS.xls')
y92 = pd.read_excel('../stockproject/12231046.1990-1999.stats/1992 HistoricalSpreadsheet_STATS.xls')
y93 = pd.read_excel('../stockproject/12231046.1990-1999.stats/1993 HistoricalSpreadsheet_STATS.xls')
y94 = pd.read_excel('../stockproject/12231046.1990-1999.stats/1994 HistoricalSpreadsheet_STATS.xls')
y95 = pd.read_excel('../stockproject/12231046.1990-1999.stats/1995 HistoricalSpreadsheet_STATS.xls')
y96 = pd.read_excel('../stockproject/12231046.1990-1999.stats/1996 HistoricalSpreadsheet_STATS.xls')
y97 = pd.read_excel('../stockproject/12231046.1990-1999.stats/1997 HistoricalSpreadsheet_STATS.xls')
y98 = pd.read_excel('../stockproject/12231046.1990-1999.stats/1998 HistoricalSpreadsheet_STATS.xls')
y99 = pd.read_excel('../stockproject/12231046.1990-1999.stats/1999 HistoricalSpreadsheet_STATS.xls')

nineties = [y91, y92, y93, y94, y95, y96, y97, y98, y99]

y00 = pd.read_excel('../stockproject/12231046.2000-2009.stats/2000 HistoricalSpreadsheet_STATS.xls')
y01 = pd.read_excel('../stockproject/12231046.2000-2009.stats/2001 HistoricalSpreadsheet_STATS.xls')
y02 = pd.read_excel('../stockproject/12231046.2000-2009.stats/2002 HistoricalSpreadsheet_STATS.xls')
y03 = pd.read_excel('../stockproject/12231046.2000-2009.stats/2003 HistoricalSpreadsheet_STATS.xls')
y04 = pd.read_excel('../stockproject/12231046.2000-2009.stats/2004 HistoricalSpreadsheet_STATS.xls')
y05 = pd.read_excel('../stockproject/12231046.2000-2009.stats/2005 Historical Spreadsheet_STATS.xls') #wth KLD
y06 = pd.read_excel('../stockproject/12231046.2000-2009.stats/2006 Historical Spreadsheet_STATS.xls')
y07 = pd.read_excel('../stockproject/12231046.2000-2009.stats/2007 HistoricalSpreadsheet_STATS.xls')
y08 = pd.read_excel('../stockproject/12231046.2000-2009.stats/2008 HistoricalSpreadsheet_STATS.xls')
y09 = pd.read_excel('../stockproject/12231046.2000-2009.stats/2009 HistoricalSpreadsheet_STATS.xls')


two_thousands = [y00, y01, y02, y03, y04]

I wanted to run 1990-2009, but evidently starting in 2005 these spreadsheets no longer represent whether a company was in the S&P500 in the same way. That’s okay: we can do this for a fifteen-year period wrangle in more data later if we would like to see a longer period of time.

Now let’s pull out the companies that belong to the S&P 500. We’ll begin by examining their ESG scores on four metrics: employment policy strengths, employment policy concerns, environmental impact strengths, and environmental impact concerns. These companies get a score of zero (0) or one (1) each year on each of several sub-metrics. For example, employment policy strengths include specific scores for workplace safety, compensation, union management, et cetera.

We’re going to sum up all of the sub-metrics for each metric per company, and then we’re going to sum that company’s total scores in that metric across our fifteen year time span. We’ll end up with a company score of cumulative strengths and concerns in employment and environmental practices over the course of the fifteen years.

twenty_years = nineties + two_thousands

def filter_500(df):
    return df[df['SP500'] == True]

sp500_90s = []    
for data in twenty_years:
  sp500_90s.append(filter_500(data))

aggregate_data = pd.concat(sp500_90s)
aggregate_data['EMP-str-sum'] = aggregate_data[['EMP-str-A','EMP-str-B','EMP-str-C','EMP-str-D','EMP-str-F','EMP-str-G']].sum(axis=1)
aggregate_data['EMP-con-sum'] = aggregate_data[['EMP-con-A','EMP-con-B','EMP-con-C','EMP-con-D']].sum(axis=1)
aggregate_data['ENV-str-sum'] = aggregate_data[['ENV-str-A','ENV-str-B','ENV-str-C','ENV-str-D','ENV-str-F','ENV-str-G']].sum(axis=1)
aggregate_data['ENV-con-sum'] = aggregate_data[['ENV-con-A','ENV-con-B','ENV-con-C','ENV-con-D','ENV-con-E','ENV-con-F']].sum(axis=1)
aggregate_data.head()

OK, so here’s our data. Let’s take a look at this data and make sure we’re getting what we want: a sum of the ESG scores on a company-by-company basis.

def sum_scores_for(dataframe, esg_marker):
    grouping = dataframe.groupby(['Ticker'])[esg_marker].sum()
    return pd.DataFrame({esg_marker : grouping}).reset_index()

esg_marker_data = sum_scores_for(aggregate_data, 'EMP-str-sum')
esg_marker_data.head()

Ah, these sums look like what we would expect to see!

esg_marker_data['EMP-con-sum'] = sum_scores_for(aggregate_data, 'EMP-con-sum')['EMP-con-sum']
esg_marker_data['ENV-str-sum'] = sum_scores_for(aggregate_data, 'ENV-str-sum')['ENV-str-sum']
esg_marker_data['ENV-con-sum'] = sum_scores_for(aggregate_data, 'ENV-con-sum')['ENV-con-sum']

esg_marker_data.sort_values(by=['EMP-con-sum'], ascending=False).head()

Second, we pull in stock performance data.

This data contains stock returns by quarter for S&P500 companies dating back to 1979. We’ll pull the columns for the ’90s for now.

price_data = pd.read_excel('../stockproject/Cleaned_Researcher_Dataset.xlsx')

new_header = price_data.iloc[0] #grab the first row for the header
content = price_data[1:] #take the data less the header row
content.columns = new_header #set the header row as the df header
content.head()

tickers = content.iloc[:,0:2]
tickers.columns = list(new_header)[0:2]

dates = content.iloc[:,45:106]
dates.columns = list(new_header)[45:106]

result = pd.concat([tickers, dates], axis=1)
result.head()

Third, we translate these stock prices into returns.

I want a metric that I can use to compare all the companies that a) belonged to the S&P 500 and b) earned some kind of KLD scores during the 1990-2005 period. Some companies only belong to the S&P500 for a subset of the years in question. We want a metric that will not penalize companies based on having spent less time in the S&P500, so a cumulative score won’t work for us. I decided to calculate quarterly returns based on the stock prices. This fairly compares each company’s stock performance during the period that an index-matching ETF would have held it, however long or short that was.

This is also nice because our mean function will only consider, for each company, those cells that have a number. So we don’t have to do as much data skullduggery to get the equation functions to spit out something meaningful.

def quarter_return(start, end):
    if start == 0 or end == 0:
        return 0
    return end / start

#WARNING: This has to go column by column because the sequence in time matters.
#Such an iterative operation takes longer than async-per-column pandas operations.
#Expect this block of code to take several seconds to run.
raw_stock_prices = result
returns_df = raw_stock_prices[['Ticker', 'Company Name']]

#We cannot easily index the columsn by name
#because the column names are datetimes rather than strings,
#so we use column index instead.
for column_name in raw_stock_prices.iloc[:,2:]:
    loc = raw_stock_prices.columns.get_loc(column_name)
    this_column = raw_stock_prices.iloc[:,loc]
    next_col = loc + 1
    try:
        next_column = raw_stock_prices.iloc[:, next_col]

        temp_df = pd.concat([this_column, next_column], axis=1)
        temp_df.columns = ['a', 'b']
    
        returns_df['quarter_starting_' + column_name.strftime('%m/%d/%Y') + '_roi'] = (
        temp_df.apply(lambda row: quarter_return(row['a'], row['b']), axis=1))
    except:
        print('End of dataframe reached')

returns_df.head()

//anaconda/lib/python3.6/site-packages/ipykernel/__main__.py:26: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

End of dataframe reached

Fourth, we combine the data into one dataframe.

We find all the companies for which we have both stock price data and ESG data, and we put the information together.

tickers = list(esg_marker_data["Ticker"]) #get all the company tickers for which we have esg data
prices_for_esg_companies = returns_df[returns_df["Ticker"].isin(tickers)] #get the stock data from companies in that list

relevant_esgs = esg_marker_data[esg_marker_data["Ticker"].isin(list(prices_for_esg_companies["Ticker"]))]
relevant_esgs = relevant_esgs.fillna(0.0) #no esg score to zero esg score

all_data = pd.concat([relevant_esgs, prices_for_esg_companies.iloc[:,1:]], axis = 1) #put the esg and stock data in one dataframe

all_data.head()

Fifth, we look at the data.

import matplotlib.pyplot as plt
import random
import seaborn as sns
%matplotlib inline

Our research question, recall, is this: do companies with low aggregated ESG scores have better stock performance than companies with high aggregated ESG scores?

We have so far calculated the return on investment of every stock, every quarter. We’re not ready to use that data, though. If we use each return individually, we’ll end up comparing each example (that is, each company) multiple times. Multiple comparisons in statistics gives you more opportunities to end up with false positives because it gives you more cracks at hitting that 5% probability lottery (if p-0.05) of your results being a fluke.

Instead, we want a way to generalize over all of its quarterly returns for those quarters in which each stock had a return.

Easy! Take an average, right? Well, not quite.

First of all, an average skews toward big outliers. If most quarters showed an ROI of 1.01 and then one quarter has an ROI of 2, the average will skew much higher than 1.01.

Pursuant to that, smaller numbers of measurements skew toward the extremes. So if a company only has three returns, each individual return affects the average a lot. That means, if one of the three is super high, the whole average will be super high. Comparatively, if a company has 200 returns and one of them is an outlier, it won’t drag the average nearly as far. In datasets where examples possess a different number of measurements, the highest and lowest values for the target variable often come from samples with few measurements.

We are in a prime position to encounter that gotcha, because some of these companies are much more measured (many more quarterly returns) than others.

So instead, we’re going to do something that might look a little weird.

Confidence Intervals

Instead of going off the average of our set of returns, we’ll take the confidence interval for it. A confidence interval attempts to take into account the fact that our data doesn’t quite match the real world: it samples the real world such that we can attempt to represent the real world in studies. So the “true average” of something might not look like the average of our measurements of it, and the fewer measurements we have the less sure we can be of the discrepancy. A confidence interval takes our average and says ‘based on this average, the true average lies between this number and this number with this probability.’

So, for our stock return measurements, we’ll get the confidence interval for each company. For companies with few measurements, that confidence interval will be wide. For companies with many measurements, it will be more narrow. We’ll then sample that pessimistically and compare the bottoms of all the companies’ confidence intervals.

It’s worth noting that this is going to give us a very low number for companies with few returns and without a high outlier. We’ll address that as well.

# Calculate average roi
returns = all_data[all_data.columns.difference(['EMP-str-sum', 'EMP-con-sum','ENV-str-sum','ENV-con-sum','Ticker','Company Name'])]
returns.head()

def confidence_interval_for(samples=[], confidence=0.95):
    sample_size = len(samples)
    degrees_freedom = sample_size - 1
    outlier_tails = (1.0 - confidence) / 2.0
    t_distribution_number = -1 * t.ppf(outlier_tails, degrees_freedom)

    step_1 = np.std(samples)/math.sqrt(sample_size) 
    step_2 = step_1 * t_distribution_number
    
    low_end = np.mean(samples) - step_2
    high_end = np.mean(samples) + step_2

    return low_end, high_end

import math
from scipy.stats import t
import numpy as np

lower_conf_intervals = []
upper_conf_intervals = []

for (idx, row) in returns.iterrows():
    maybe_measurements = row.tolist()
    measurements = [float(x) for x in maybe_measurements if (math.isnan(float(x)) == False)]
    bottom, top = confidence_interval_for(measurements)
    lower_conf_intervals.append(bottom)
    upper_conf_intervals.append(top)

//anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:82: RuntimeWarning: Degrees of freedom <= 0 for slice
  warnings.warn("Degrees of freedom <= 0 for slice", RuntimeWarning)
//anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:94: RuntimeWarning: invalid value encountered in true_divide
  arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
//anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:116: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
//anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
//anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:70: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)

all_data['avg_quarterly_roi'] = returns.mean(axis=1, skipna=True)
all_data['num_measurements'] = returns.count(axis=1)
all_data['lower_conf_interval'] = lower_conf_intervals
all_data['upper_conf_interval'] = upper_conf_intervals

all_data.head(10)

Before we go on, let’s talk statistical power.

The statistical power of a dataset describes the likelihood that you will be able to detect a meaningful difference with this data given that the difference is there. It’s an important precursor step to data analysis because it determines your ability to find the effect you’re looking for.

Calculating statistical power is a little complicated, and it’s generally a good idea to get someone who knows what they’re doing to help you. I’m not confident in my ability to evaluate statistical power calculation strategies yet, but I figured I’d start with the all-around method and used this online calculator.

I performed a one-tail statistical power test. I took the mean of the bottom end of confidence intervals on stock returns—0.92. Suppose this were only the mean for companies with favorable ESG scores, and suppose that companies with unfavorable ESG scores had a marginally but meaningfully higher quarterly return of 0.93. So 0.92 is my sample average and 0.93 is my test value.

The sample size is how many examples we have: 1087.

Our standard deviation for sample is…

all_data['lower_conf_interval'].describe()

count    722.000000
mean       0.918040
std        0.185344
min       -1.515708
25%        0.924215
50%        0.958641
75%        0.975608
max        1.041034
Name: lower_conf_interval, dtype: float64

…0.12. so we plug that in.

I left the confidence level at 5% and ran the calculator.

My result: 86.5%

That’s the probability, if there is a 1% difference or more between stock returns for companies with favorable and unfavorable ESG scores, we have an 86.5% chance of detecting it. Statistical power expectations vary, but a rule of thumb (and standard practice at some more rigorous scientific journals) is to expect a statistical power of 80% or higher to take a study seriously.

OK, time to make some pictures! Let’s plot our ESG score sums against average roi and see if we notice any trends.

f, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, sharex='col', sharey='row')
f.suptitle('ESG Sums Plotted Against Returns')
ax1.scatter(x=all_data['EMP-str-sum'], y=all_data['lower_conf_interval'])
ax2.scatter(x=all_data['EMP-con-sum'], y=all_data['lower_conf_interval'])
ax3.scatter(x=all_data['ENV-str-sum'], y=all_data['lower_conf_interval'])
ax4.scatter(x=all_data['ENV-con-sum'], y=all_data['lower_conf_interval'])
all_data.shape

(1087, 70)

We have a couple of really low ones on those charts. Let’s look at them:

all_data[['Company Name','lower_conf_interval', 'num_measurements']].sort('lower_conf_interval').head(10)

//anaconda/lib/python3.6/site-packages/ipykernel/__main__.py:1: FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....)
  if __name__ == '__main__':

We spoke a little bit earlier about the way our measuring strategy would especially penalize companies with only a few measurements. We’re seeing that here. Out of these bottom 10 scores for the lower end of stock return confidence intervals, only one has more than 7 data points. Compare this to the mean number of measurements:

all_data['num_measurements'].mean()

24.3836246550138

But there’s another thing to note about this data: five of the values are negative. These scores are low, yes, but a negative value for a stock return does not make sense. The value of a stock can drop to zero, but it does not turn into debt to the company owed by the stockholders. The lowest a stock price can drop is to zero. To accurately represent that, we’ll clip these values at zero.

all_data['lower_conf_interval'] = all_data['lower_conf_interval'].clip_lower(0)

Let’s check the means too to make sure we don’t have negative values in the means:

all_data[['Company Name','avg_quarterly_roi', 'num_measurements']].sort('avg_quarterly_roi').head(10)

//anaconda/lib/python3.6/site-packages/ipykernel/__main__.py:1: FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....)
  if __name__ == '__main__':

Cool: we don’t. This means that none of the upper ends of the confidence intervals will have negative values either, as the upper end of the confidence interval is always higher than the mean on which it is based.

Let’s check out our plot again with those adjustments:

f, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, sharex='col', sharey='row')
f.suptitle('ESG Sums Plotted Against Returns')
ax1.scatter(x=all_data['EMP-str-sum'], y=all_data['lower_conf_interval'])
ax2.scatter(x=all_data['EMP-con-sum'], y=all_data['lower_conf_interval'])
ax3.scatter(x=all_data['ENV-str-sum'], y=all_data['lower_conf_interval'])
ax4.scatter(x=all_data['ENV-con-sum'], y=all_data['lower_conf_interval'])

<matplotlib.collections.PathCollection at 0x11fa3aa90>

Let’s look at the general trends of the returns. They look pretty flat to me: I don’t see an upward or downward trend in quarterly ROI based on any of the 4 ESG metrics.

Let’s dig a little deeper and see if the numbers themselves support that.

esg_sum_zero = all_data[all_data['EMP-str-sum'] == 0.0]
twenty_examples = esg_sum_zero.sample(20)

esg_sum_above_zero = all_data[all_data['EMP-str-sum'] > 0.0]

emp_str_sum = pd.concat([esg_sum_above_zero, twenty_examples])

The thing about a lot of our data is that it skews heavily toward one side or the other. For example, there are a lot more companies with low employment policy strength scores than high ones. How do we account for this when we do our aggregations? In the code below, I try to separate strata of ESG scores into buckets of similar size. That means a lot of the higher scores get bucketed together, while zero (0) or one(1) get their own buckets.

Here’s a question though—why do buckets at all? Why not run a linear regressor on the data as-is and call it good?

Good question! A regressor can frequently be a preferable option to drawing arbitrary distinctions in continuous data. This isn’t exaclty continuous data as the ESG aggregations are all natural numbers, but that is not the reason for the buckets. The reason for the buckets is to have each of our aggregated data points come from roughly the same number of data examples. If we run a regressor, the bottom of the regressor represents a ton of data points, and as we go higher it represents fewer and fewer data points such that the expected value up there means very little.

I’d be very concerned to artificially divide this data into only two buckets such that points close to the arbitrary division get lumped in with very different means. So we’ll break it into as many buckets as we reasonably can such that each bucket of data points contains data points that are roughly similar to one another in their ESG scores.

def buckets(score):
    if score == 0.0:
        val = '0' 
    if 0.0 < score < 2.0:
        val = '1' 
    if 1.0 < score < 4.0:
        val = '2-3' 
    if 3.0 < score < 6.0:
        val = '4-5' 
    if 5.0 < score < 8.0:
        val = '6-7'
    if 7.0 < score < 11.0:
        val = '8-10'
    if 10.0 < score:
        val = 'Above 11' 
    return val

emp_str_sum['emp_str_sum_grouping'] = emp_str_sum['EMP-str-sum'].apply(buckets)
emp_str_sum.head()

And here are our initial results for employment policy strength scores and quarterly ROIs!

emp_str_sum_groupby = emp_str_sum.groupby(['emp_str_sum_grouping']).agg({'lower_conf_interval': ['count','mean','std']}).reset_index()
emp_str_sum_groupby.sort_values('emp_str_sum_grouping')

My buckets are not perfectly the same sizes, but the means are not orders of magnitude apart. They’re all in the 0.91-0.94 range, whether the company had a very high employment policy strength score or a sorry zero.

emp_str = emp_str_sum_groupby['emp_str_sum_grouping']
emp_str_qtrly_roi = emp_str_sum_groupby['lower_conf_interval']['mean']
annual_view = pd.concat([emp_str, emp_str_qtrly_roi], axis=1).reset_index()
annual_view.columns = ['index','emp_str_sum_grouping', 'low_quarterly_estimate_roi']
annual_view

Unfortunately each of the data have distinct distributions of values, which means that it’s not trivial to modularize the bucketing strategy. We have to bucket them all differently to get roughly equal buckets. So we’re going to be writing similar code several times over here. Were we doing this analysis continuously in, say, an app, we would find a better way to do this.

esg_sum_zero = all_data[all_data['EMP-con-sum'] == 0.0]
eighteen_examples = esg_sum_zero.sample(18)

esg_sum_above_zero = all_data[all_data['EMP-con-sum'] > 0.0]

emp_con_sum = pd.concat([esg_sum_above_zero, eighteen_examples])

def buckets(score):
    if score == 0.0:
        val = '0' 
    if 0.0 < score < 3.0:
        val = '1-2' 
    if 2.0 < score < 5.0:
        val = '3-4'
    if 4.0 < score:
        val = '5 and up'
    return val

emp_con_sum['emp_con_sum_grouping'] = emp_con_sum['EMP-con-sum'].apply(buckets)
emp_con_sum.head()

emp_con_sum_groupby = emp_con_sum.groupby(['emp_con_sum_grouping'])[['lower_conf_interval']].agg(['count', 'mean','std']).reset_index()
emp_con_sum_groupby.sort_values('emp_con_sum_grouping')

emp_con = emp_con_sum_groupby['emp_con_sum_grouping']
emp_con_qtrly_roi = emp_con_sum_groupby['lower_conf_interval']['mean']
annual_view = pd.concat([emp_con, emp_con_qtrly_roi], axis=1).reset_index()
annual_view.columns = ['index','emp_con_sum_grouping', 'low_quarterly_estimate_roi']
annual_view

Especially uneven bucketing here, unfortunately. I could take a random sampling of the larger buckets. Maybe in the next round.

Anyway, this is employment concerns (scores for treating workers poorly).

esg_sum_zero = all_data[all_data['ENV-str-sum'] == 0.0]
twenty_examples = esg_sum_zero.sample(20)

esg_sum_above_zero = all_data[all_data['ENV-str-sum'] > 0.0]

env_str_sum = pd.concat([esg_sum_above_zero, twenty_examples])

def buckets(score):
    if score == 0.0:
        val = '0' 
    if 0.0 < score < 4.0:
        val = '1-3' 
    if 3.0 < score < 10.0:
        val = '4-9'
    if 9.0 < score:
        val = '9 and up'
    return val

env_str_sum['env_str_sum_grouping'] = env_str_sum['ENV-str-sum'].apply(buckets)
env_str_sum.head()

env_str_sum_groupby = env_str_sum.groupby(['env_str_sum_grouping'])[['lower_conf_interval']].agg(['count', 'mean','std']).reset_index()
env_str_sum_groupby.sort_values('env_str_sum_grouping')

env_str = env_str_sum_groupby['env_str_sum_grouping']
env_str_qtrly_roi = env_str_sum_groupby['lower_conf_interval']['mean']
annual_view = pd.concat([env_str, env_str_qtrly_roi], axis=1).reset_index()
annual_view.columns = ['index','env_str_sum_grouping', 'low_quarterly_estimate_roi']
annual_view

Here we have environmental impact strength. I could stratify these buckets more to get a more granular view.

env_con_sum = all_data

def buckets(score):
    val = score
    if score == 0.0:
        val = '0' 
    if score == 1.0:
        val = '1' 
    if score == 2.0:
        val = '2' 
    if 2.0 < score < 5.0:
        val = '3-4'
    if 4.0 < score < 8.0:
        val = '5-7'
    if 7.0 < score < 13.0:
        val = '8-12'
    if 12.0 < score < 22.0:
        val = 'Above 12, below 21'
    if 21.0 < score:
        val = 'Above 21'
    return val

env_con_sum['env_con_sum_grouping'] = env_con_sum['ENV-con-sum'].apply(buckets)
env_con_sum.head()

env_con_sum_groupby = env_con_sum.groupby(['env_con_sum_grouping'])[['lower_conf_interval']].agg(['count', 'mean','std']).reset_index()
env_con_sum_groupby.sort_values('env_con_sum_grouping')

env_con = env_con_sum_groupby['env_con_sum_grouping']
env_con_qtrly_roi = env_con_sum_groupby['lower_conf_interval']['mean']
annual_view = pd.concat([env_con, env_con_qtrly_roi], axis=1).reset_index()
annual_view.columns = ['index','env_con_sum_grouping', 'low_quarterly_estimate_roi']
annual_view

Finally, we have environmental concerns. In this one, the zero category is WAY bigger than the other categories. We could take a sample of this to alleviate that, if we wanted to.

f, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, sharex='col', sharey='row')
f.suptitle('ESG Sums Plotted Against Returns')
ax1.plot(emp_str_qtrly_roi)
ax2.plot(emp_con_qtrly_roi)
ax3.plot(env_str_qtrly_roi)
ax4.plot(env_con_qtrly_roi)

[<matplotlib.lines.Line2D at 0x12326a710>]

How fascinating. No clear trend in stock prices on the strengths categories, but higher concern ratings for environmental and employment practices correlate negatively with stock performance.

Sixth, we determine what our results could mean.¶

OK friends, it’s time to figure out what our data means, or rather, whether it means anything at all with respect to our question: do companies with higher str ESG scores and lower con ESG scores have lower stock returns than companies that perform more poorly on ESG metrics?

First, let’s calculate our confidence intervals for stock performance and determine if the stock performances for companies low ESG scores fall outside the confidence intervals for the stock performance of companies with high ESG scores.

import math 
from scipy.stats import t
import numpy as np

emp_str_sum_groupby

labels=emp_str_sum_groupby['emp_str_sum_grouping'] 
sample_size = emp_str_sum_groupby['lower_conf_interval']['count']
standard_deviation = emp_str_sum_groupby['lower_conf_interval']['std']
mean = mean = emp_str_sum_groupby['lower_conf_interval']['mean']

def confidence_interval_for_collection(sample_size=[], standard_deviation=[], mean=[], confidence=0.95):
    degrees_freedom = [count - 1 for count in sample_size] 
    outlier_tails = (1.0 - confidence) / 2.0
    confidence_collection = [outlier_tails for _ in sample_size]
#     Why the HELL are the T-distribution values given as negative numbers??
    t_distribution_number = [-1 * t.ppf(tails, df) for tails, df in zip(confidence_collection, degrees_freedom)]

    step_1 = [std/math.sqrt(count) for std, count in zip(standard_deviation, sample_size)]
    step_2 = [step * t for step, t in zip(step_1, t_distribution_number)]

    low_end = [mean_num - step_num for mean_num, step_num in zip(mean, step_2)]
    high_end = [mean_num + step_num for mean_num, step_num in zip(mean, step_2)]

    return low_end, high_end
                                    
low_end, high_end = confidence_interval_for_collection(sample_size, standard_deviation, mean)
emp_str_confidence_interval_data = pd.concat([labels, sample_size, standard_deviation, mean], axis=1)
emp_str_confidence_interval_data['.95 confidence level min'] = low_end
emp_str_confidence_interval_data['.95 confidence level max'] = high_end
emp_str_confidence_interval_data

labels=emp_con_sum_groupby['emp_con_sum_grouping'] 
sample_size = emp_con_sum_groupby['lower_conf_interval']['count']
standard_deviation = emp_con_sum_groupby['lower_conf_interval']['std']
mean = mean = emp_con_sum_groupby['lower_conf_interval']['mean']
 
low_end, high_end = confidence_interval_for_collection(sample_size, standard_deviation, mean)
emp_con_confidence_interval_data = pd.concat([labels, sample_size, standard_deviation, mean], axis=1)
emp_con_confidence_interval_data['.95 confidence level min'] = low_end
emp_con_confidence_interval_data['.95 confidence level max'] = high_end
emp_con_confidence_interval_data

labels=env_str_sum_groupby['env_str_sum_grouping'] 
sample_size = env_str_sum_groupby['lower_conf_interval']['count']
standard_deviation = env_str_sum_groupby['lower_conf_interval']['std']
mean = mean = env_str_sum_groupby['lower_conf_interval']['mean']
 
low_end, high_end = confidence_interval_for_collection(sample_size, standard_deviation, mean)
env_str_confidence_interval_data = pd.concat([labels, sample_size, standard_deviation, mean], axis=1)
env_str_confidence_interval_data['.95 confidence level min'] = low_end
env_str_confidence_interval_data['.95 confidence level max'] = high_end
env_str_confidence_interval_data

labels=env_con_sum_groupby['env_con_sum_grouping'] 
sample_size = env_con_sum_groupby['lower_conf_interval']['count']
standard_deviation = env_con_sum_groupby['lower_conf_interval']['std']
mean = mean = env_con_sum_groupby['lower_conf_interval']['mean']
 
low_end, high_end = confidence_interval_for_collection(sample_size, standard_deviation, mean)
env_con_confidence_interval_data = pd.concat([labels, sample_size, standard_deviation, mean], axis=1)
env_con_confidence_interval_data['.95 confidence level min'] = low_end
env_con_confidence_interval_data['.95 confidence level max'] = high_end
env_con_confidence_interval_data

In all four cases, there is significant overlap between the confidence intervals for the stock performances of the companies with the lowest and the highest ESG scores. That is, we can’t be confident at all that stocks from the high ESG score groups will have higher or lower returns than the stocks from the low ESG score groups.

But wait! That doesn’t necessarily mean that there is no meaningful difference. The confidence intervals can overlap for two series of points whose data still has a meaningful difference. To determine whether it does, we need to perform additional significance testing—in this case, a t test.

How to do the independent samples t test

def t_test_for(num_samples_1, standard_deviation_1, mean1, num_samples_2, standard_deviation_2, mean2, confidence=0.95):
  alpha = 1 - confidence
  total_degrees_freedom = num_samples_1 + num_samples_2 - 2

  t_distribution_number = -1 * t.ppf(alpha, total_degrees_freedom)

  degrees_freedom_1 = num_samples_1 - 1
  degrees_freedom_2 = num_samples_2 - 1
  sum_of_squares_1 = (standard_deviation_1 ** 2) * degrees_freedom_1
  sum_of_squares_2 = (standard_deviation_2 ** 2) * degrees_freedom_2

  combined_variance = (sum_of_squares_1 + sum_of_squares_2) / (degrees_freedom_1 + degrees_freedom_2)
  first_dividend_addend = combined_variance/float(num_samples_1)
  second_dividend_addend = combined_variance/float(num_samples_2)

  denominator = math.sqrt(first_dividend_addend + second_dividend_addend)
  numerator = mean1 - mean2
  t_value = float(numerator)/float(denominator)

  accept_null_hypothesis = abs(t_value) < abs(t_distribution_number) #results are not significant

  return accept_null_hypothesis, t_value

We have written a method that will tell us, firstly, whether to accept or reject the null hypothesis, which assumes no meaningful difference between the two sets of data we want to compare. I have named that output ‘accept_null_hypothesis’ because I don’t love the ubiquitous use of the confounding phrase ‘reject the null hypothesis’ in scientifi inquiry. It’s a double negative (reject the absence of meaningful difference), which adds an unnecessary additional piece of mental acrobatics to the (already frequently herculean) task of determining what, exactly, the scientists are trying to say in their conclusion paragraph.

We are going with accept the absence of meaningful difference as the variable name for two reasons. First of all, we remove the double negative this way. Second of all, accepting the null hypothesis is (or should be) the outcome of the vast majority of scientific inquiry. Scientists, collectively, test a whole bunch of stuff to see what has an effect. Most of the things tried, it turns out, don’t have that effect. So our accept_null_hypothesis value will usually be true. When it’s false, we should sit up and pay attention.

n1 = emp_str_confidence_interval_data['count'][0]
std1 = emp_str_confidence_interval_data['std'][0]
mean1 = emp_str_confidence_interval_data['mean'][0]

n2 = emp_str_confidence_interval_data['count'][6]
std2 = emp_str_confidence_interval_data['std'][6]
mean2 = emp_str_confidence_interval_data['mean'][6]

accept_null_hypothesis, t_value = t_test_for(n1, std1, mean1, n2, std2, mean2)
print(accept_null_hypothesis)
print(t_value)

True
1.1701056052229704

n1 = emp_con_confidence_interval_data['count'][0]
std1 = emp_con_confidence_interval_data['std'][0]
mean1 = emp_con_confidence_interval_data['mean'][0]

n2 = emp_con_confidence_interval_data['count'][3]
std2 = emp_con_confidence_interval_data['std'][3]
mean2 = emp_con_confidence_interval_data['mean'][3]

accept_null_hypothesis, t_value = t_test_for(n1, std1, mean1, n2, std2, mean2)
print(accept_null_hypothesis)
print(t_value)

True
1.249654566596557

n1 = env_str_confidence_interval_data['count'][0]
std1 = env_str_confidence_interval_data['std'][0]
mean1 = env_str_confidence_interval_data['mean'][0]

n2 = env_str_confidence_interval_data['count'][3]
std2 = env_str_confidence_interval_data['std'][3]
mean2 = env_str_confidence_interval_data['mean'][3]

accept_null_hypothesis, t_value = t_test_for(n1, std1, mean1, n2, std2, mean2)
print(accept_null_hypothesis)
print(t_value)

True
0.12761158561661315

How about that! So var, accepting the null hypothesis everywhere. It appears that we cannot meaninfgully separate the stock performance of companies on either end of the spectrum for each of these ESG scores.

n1 = env_con_confidence_interval_data['count'][0]
std1 = env_con_confidence_interval_data['std'][0]
mean1 = env_con_confidence_interval_data['mean'][0]

n2 = env_con_confidence_interval_data['count'][7]
std2 = env_con_confidence_interval_data['std'][7]
mean2 = env_con_confidence_interval_data['mean'][7]

accept_null_hypothesis, t_value = t_test_for(n1, std1, mean1, n2, std2, mean2)
print(accept_null_hypothesis)
print(t_value)

False
1.6638008075952688

Ah! But check out this one! It just so happens that companies that raise a lot of environmental concerns, as measured by ESG scores, have meaningfully worse stock performances than companies that do not.

Conclusion

On this pass, we have taken a much more thorough look at our data with respect to how well it approximates the real world and how certain we are about that approximation.

This did not mean tacking on a test for statistical significance at the end: it meant rethinking our analysis from the very beginning, including what our target variable would be (previously extrapolated annual roi, now back to mean quarterly roi) and also how we would represent it (with the bottom end of the confidence interval instead of the mean itself).

It also meant checking our results more thoroughly along the way, identifying unrealistic values in the data, understanding why those values were what they were, and replacing them with values that better approximate reality.

Finally, it meant conducting a t test rather than eyeballing the results.

The result here provides an excellent example of the difference between how data might be interpreted or perceived versus what it really represents. When we use the bottom of the confidence intervals, we’re looking at a lot of losses in shareholder value. It’s important to note that this isn’t really what happened. It is, more accurately, a statistically derived worst-case scenario for a company that did everything that a given example company did, based on the metrics we have about how that company actually did. That is not useful for predicting what the stock returns for the example company will be in the future (fun fact: trying to predict stock returns in general is not that useful, as straight-up index matching outperforms the vast majority of shrewder, prediction-based investment strategies in the long term. I don’t miss our loss of predictive value because there wasn’t much demonstrated value there to begin with.)

At any rate, what it is useful for is accurately representing the fact that there is a lot we do not know and attempting to account for that in a comparison of stock returns relative to ESG ratings. We’re not trying to predict returns by company: we’re trying to determine if companies with favorable ESG scores, in aggregate, underperform on the stock market relative to companies with unfavorable ESG scores. For this, our data is useful.

And, as we’ve established, we don’t see in this analysis a meaningful difference in most cases. In the one case where we do see a meaningful difference, the companies with favorable ESG scores are in fact outperforming the companies with unfavorable ESG scores. All of this bodes well for the application question—can I create a socially conscious investment portfolio without sacrificing returns on my investment?

There’s more to look at to get a more definitive answer to this question. But so far, what we’re seeing points to yes.

Additional Notes

While I am much happier with our process in this run of the notebook, I am highly dissatisfied with the state of the code. It’s very scripty, parts of it feel repetitive, and it all has to be run in order because I reused variable names. I would call the work up to this point an exploration phase. If I were to ship this work for any kind of reuse, I’d like it to look nothing like this. Maybe we’ll make that a follow-up post later about this notebook, or maybe we’ll focus more on that in our next note book series.

	ALC-con-X	BMS	CGOV-con-G	CGOV-con-H	CGOV-con-I	…	Russell2000	SP500	TOB-con-X	Ticker	EMP-str-sum	ENV-con-sum
0	NR	NaN	NR	NR	NR	…	NaN	True	NR	ABT	0	1
1	NR	NaN	NR	NR	NR	…	NaN	True	NR	AMT	0	0
3	NR	NaN	NR	NR	NR	…	NaN	True	NR	AMD	1	1
4	NR	NaN	NR	NR	NR	…	NaN	True	NR	AET	1	0
6	NR	NaN	NR	NR	NR	…	NaN	True	NR	AHM	0	0

	Ticker	EMP-str-sum
0	A	6
1	AA	19
2	AAL	1
3	AAPL	7
4	ABC	0

	Ticker	Company Name	1989-12-31 00:00:00	1990-03-31 00:00:00	1990-06-30 00:00:00	1990-09-30 00:00:00	1990-12-31 00:00:00	1991-03-31 00:00:00	1991-06-30 00:00:00	1991-09-30 00:00:00	…	2002-09-30 00:00:00	2002-12-31 00:00:00	2003-03-31 00:00:00	2003-06-30 00:00:00	2003-09-30 00:00:00	2003-12-31 00:00:00	2004-03-31 00:00:00	2004-06-30 00:00:00	2004-09-30 00:00:00	2004-12-31 00:00:00
1	A	Agilent Technologies Inc.	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	…	13.06	17.96	13.15	19.55	22.11	29.24	31.63	29.28	21.57	24.1
2	AA	Alcoa Inc	75	64.75	63.75	62.63	57.63	65.5	67.5	63.88	…	19.3	22.78	19.38	25.5	26.16	38	34.69	33.03	33.59	31.42
3	AAL	American Airlines Group	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	AAL (Alex)	Alexander & Alexander	30.63	26.75	24.38	17.88	23.13	26.25	22.5	20.88	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
5	AAP	Advance Auto Parts Inc	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

	Ticker	Company Name	quarter_starting_12/31/1989_roi	quarter_starting_03/31/1990_roi	quarter_starting_06/30/1990_roi	quarter_starting_09/30/1990_roi	quarter_starting_12/31/1990_roi	quarter_starting_03/31/1991_roi	quarter_starting_06/30/1991_roi	quarter_starting_09/30/1991_roi	…	quarter_starting_06/30/2002_roi	quarter_starting_09/30/2002_roi	quarter_starting_12/31/2002_roi	quarter_starting_03/31/2003_roi	quarter_starting_06/30/2003_roi	quarter_starting_09/30/2003_roi	quarter_starting_12/31/2003_roi	quarter_starting_03/31/2004_roi	quarter_starting_06/30/2004_roi	quarter_starting_09/30/2004_roi
1	A	Agilent Technologies Inc.	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	…	0.552220	1.375191	0.732183	1.486692	1.130946	1.322479	1.081737	0.925703	0.736680	1.117293
2	AA	Alcoa Inc	0.863333	0.984556	0.982431	0.920166	1.136561	1.030534	0.94637	1.007827	…	0.582202	1.180311	0.850746	1.315789	1.025882	1.452599	0.912895	0.952148	1.016954	0.935397
3	AAL	American Airlines Group	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	AAL (Alex)	Alexander & Alexander	0.873327	0.911402	0.733388	1.293624	1.134890	0.857143	0.92800	0.981801	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
5	AAP	Advance Auto Parts Inc	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

	Ticker	EMP-str-sum	EMP-con-sum	ENV-str-sum	ENV-con-sum	Company Name	quarter_starting_12/31/1989_roi	quarter_starting_03/31/1990_roi	quarter_starting_06/30/1990_roi	quarter_starting_09/30/1990_roi	…	quarter_starting_06/30/2002_roi	quarter_starting_09/30/2002_roi	quarter_starting_12/31/2002_roi	quarter_starting_03/31/2003_roi	quarter_starting_06/30/2003_roi	quarter_starting_09/30/2003_roi	quarter_starting_12/31/2003_roi	quarter_starting_03/31/2004_roi	quarter_starting_06/30/2004_roi	quarter_starting_09/30/2004_roi
0	A	6.0	4.0	0.0	0.0	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	AA	19.0	0.0	15.0	37.0	Agilent Technologies Inc.	NaN	NaN	NaN	NaN	…	0.552220	1.375191	0.732183	1.486692	1.130946	1.322479	1.081737	0.925703	0.736680	1.117293
2	AAL	1.0	0.0	0.0	0.0	Alcoa Inc	0.863333	0.984556	0.982431	0.920166	…	0.582202	1.180311	0.850746	1.315789	1.025882	1.452599	0.912895	0.952148	1.016954	0.935397
3	AAPL	7.0	7.0	2.0	0.0	American Airlines Group	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	ABC	0.0	0.0	0.0	0.0	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

Chelsea Troy

Analyzing Relationships Between ESG Data and Stock Performance, Part 3

Millenials and Investment: an Ongoing Exploration

Correlating KLD ESG Ratings to Stock Performance, 1990-2005

First, we pull in the ESG data.

Second, we pull in stock performance data.

Third, we translate these stock prices into returns.

Fourth, we combine the data into one dataframe.

Fifth, we look at the data.

Confidence Intervals

Sixth, we determine what our results could mean.¶

Conclusion

Additional Notes

Like this:

Leave a ReplyCancel reply

	Ticker	EMP-str-sum	EMP-con-sum	ENV-str-sum	ENV-con-sum
150	CAT	16	23	1	10
847	UNP	4	18	2	28
649	PD	0	17	0	26
501	LPX	1	17	3	17
443	IP	0	16	15	36

	quarter_starting_03/31/1990_roi	quarter_starting_03/31/1991_roi	quarter_starting_03/31/1992_roi	quarter_starting_03/31/1993_roi	quarter_starting_03/31/1994_roi	quarter_starting_03/31/1995_roi	quarter_starting_03/31/1996_roi	quarter_starting_03/31/1997_roi	quarter_starting_03/31/1998_roi	quarter_starting_03/31/1999_roi	…	quarter_starting_12/31/1994_roi	quarter_starting_12/31/1995_roi	quarter_starting_12/31/1996_roi	quarter_starting_12/31/1997_roi	quarter_starting_12/31/1998_roi	quarter_starting_12/31/1999_roi	quarter_starting_12/31/2000_roi	quarter_starting_12/31/2001_roi	quarter_starting_12/31/2002_roi	quarter_starting_12/31/2003_roi
0	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	0.561279	1.226236	0.732183	1.081737
2	0.984556	1.030534	1.076312	1.085271	1.020941	1.207952	0.916174	1.108529	0.958291	1.502306	…	0.479049	1.18438	1.066667	0.977693	0.552441	0.846386	1.073134	1.061603	0.850746	0.912895
3	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

	Company Name	lower_conf_interval	num_measurements
1282	Valero Energy	-1.515708	2
480	Fred Meyer Inc.	-1.307858	2
1019	Rational Software	-1.054724	3
1354	Young & Rubicam	-0.536857	2
164	BMC Software	-0.065493	24
260	CIT Group Inc.	0.014914	2
928	Palm Inc.	0.124468	7
1070	Sapient Corp.	0.261729	7
1291	Vitesse Semiconductor	0.266343	6
641	Immunex Corp.	0.348569	3

	Company Name	avg_quarterly_roi	num_measurements
28	Adaptec Inc.	0.720171	5
1291	Vitesse Semiconductor	0.732021	6
573	Harcourt Brace Jovanovich	0.781262	7
1070	Sapient Corp.	0.806787	7
1282	Valero Energy	0.826730	2
682	JWP Inc.	0.833895	12
928	Palm Inc.	0.845871	7
292	Conexant Systems	0.874049	8
575	HBO & Co.	0.913608	4
166	Battle Mountain Gold Co	0.915304	11

	emp_str_sum_grouping	lower_conf_interval
		count	mean	std
0	0	9	0.960676	0.020428
1	1	39	0.924618	0.110909
2	2-3	41	0.945451	0.049264
3	4-5	34	0.933286	0.089118
4	6-7	27	0.942882	0.065201
5	8-10	30	0.934034	0.085127
6	Above 11	38	0.927873	0.082855

	Ticker	EMP-str-sum	EMP-con-sum	ENV-str-sum	ENV-con-sum	Company Name	quarter_starting_12/31/1989_roi	quarter_starting_03/31/1990_roi	quarter_starting_06/30/1990_roi	quarter_starting_09/30/1990_roi	…	quarter_starting_09/30/2003_roi	quarter_starting_12/31/2003_roi	quarter_starting_03/31/2004_roi	quarter_starting_06/30/2004_roi	quarter_starting_09/30/2004_roi	avg_quarterly_roi	lower_conf_interval	upper_conf_interval	emp_con_sum_grouping
0	A	6.0	4.0	0.0	0.0	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	3-4
3	AAPL	7.0	7.0	2.0	0.0	American Airlines Group	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	5 and up
7	ABS	9.0	8.0	0.0	0.0	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	5 and up
9	ABX	4.0	3.0	0.0	3.0	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	3-4
16	ACY	2.0	3.0	0.0	13.0	NaN	NaN	NaN	NaN	NaN	…	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	3-4

	emp_con_sum_grouping	lower_conf_interval
		count	mean	std
0	0	10	0.950467	0.025037
1	1-2	55	0.936785	0.060575
2	3-4	41	0.936625	0.079496
3	5 and up	39	0.904615	0.114478

	env_str_sum_grouping	lower_conf_interval
		count	mean	std
0	0	8	0.925646	0.076902
1	1-3	41	0.934895	0.078974
2	4-9	44	0.927810	0.088072
3	9 and up	23	0.920900	0.094553

	env_con_sum_grouping	lower_conf_interval
		count	mean	std
0	0	252	0.932306	0.107379
1	1	17	0.946235	0.069817
2	2	12	0.939598	0.051256
3	3-4	22	0.939741	0.052115
4	5-7	27	0.931011	0.083693
5	8-12	21	0.929778	0.072504
6	Above 12, below 21	26	0.925077	0.089422
7	Above 21	19	0.888986	0.134940

	emp_str_sum_grouping	count	std	mean	.95 confidence level min	.95 confidence level max
0	0	9	0.020428	0.960676	0.944974	0.976378
1	1	39	0.110909	0.924618	0.888665	0.960570
2	2-3	41	0.049264	0.945451	0.929901	0.961000
3	4-5	34	0.089118	0.933286	0.902191	0.964381
4	6-7	27	0.065201	0.942882	0.917089	0.968675
5	8-10	30	0.085127	0.934034	0.902247	0.965821
6	Above 11	38	0.082855	0.927873	0.900639	0.955107

	Ticker	EMP-str-sum	EMP-con-sum	ENV-str-sum	ENV-con-sum
150	CAT	16	23	1	10
847	UNP	4	18	2	28
649	PD	0	17	0	26
501	LPX	1	17	3	17
443	IP	0	16	15	36

	emp_con_sum_grouping	count	std	mean	.95 confidence level min	.95 confidence level max
0	0	10	0.025037	0.950467	0.932557	0.968378
1	1-2	55	0.060575	0.936785	0.920410	0.953161
2	3-4	41	0.079496	0.936625	0.911533	0.961717
3	5 and up	39	0.114478	0.904615	0.867505	0.941724

	env_str_sum_grouping	count	std	mean	.95 confidence level min	.95 confidence level max
0	0	8	0.076902	0.925646	0.861355	0.989938
1	1-3	41	0.078974	0.934895	0.909968	0.959822
2	4-9	44	0.088072	0.927810	0.901034	0.954586
3	9 and up	23	0.094553	0.920900	0.880013	0.961788

	env_con_sum_grouping	count	std	mean	.95 confidence level min	.95 confidence level max
0	0	252	0.107379	0.932306	0.918984	0.945628
1	1	17	0.069817	0.946235	0.910338	0.982132
2	2	12	0.051256	0.939598	0.907031	0.972164
3	3-4	22	0.052115	0.939741	0.916635	0.962847
4	5-7	27	0.083693	0.931011	0.897903	0.964118
5	8-12	21	0.072504	0.929778	0.896775	0.962782
6	Above 12, below 21	26	0.089422	0.925077	0.888959	0.961195
7	Above 21	19	0.134940	0.888986	0.823947	0.954025

Analyzing Relationships Between ESG Data and Stock Performance, Part 3

Millenials and Investment: an Ongoing Exploration

Correlating KLD ESG Ratings to Stock Performance, 1990-2005

First, we pull in the ESG data.

Second, we pull in stock performance data.

Third, we translate these stock prices into returns.

Fourth, we combine the data into one dataframe.

Fifth, we look at the data.

Confidence Intervals

Sixth, we determine what our results could mean.¶

Conclusion

Additional Notes

Share this:

Like this:

Leave a ReplyCancel reply

Discover more from Chelsea Troy

	Ticker	EMP-str-sum	EMP-con-sum	ENV-str-sum	ENV-con-sum
150	CAT	16	23	1	10
847	UNP	4	18	2	28
649	PD	0	17	0	26
501	LPX	1	17	3	17
443	IP	0	16	15	36