Week 04 - Histograms, Functions, Groups¶

Sections 31 and 36, solutions at dchotai.github.io/resources

# Import some modules to use
import numpy as np
from datascience import *

%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use("fivethirtyeight")

This week, we'll use data from Basketball-Reference again that describes the average statistics of NBA players for the 2016-17 season. Here's the key again for reference:

Label	Meaning
Pos	Position (Point Guard, Shooting Guard, Small Forward, Power Forward, Center)
Tm	Team abbreviation
G / GS	Games played / Games started
MP	Minutes played
FG[A][%]	Field goals [attempted] [% made]
3P / 2P	Three-point / two-point field goals
FT	Free throws
[O][D][T]RB	Offensive / Defensive / Total rebounds
AST	Assists
STL	Steals
BLK	Blocks
TOV	Turnovers
PF	Personal Fouls
PPG	Points per game

# Just run this
nba = Table().read_table('nba_16_17.csv').drop('ORB', 'DRB').where('FGA', are.above(0))
nba

array([18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42])

Most accurate team: GSW

Highest Average FG%: SF, Highest average PPG: SF

Histograms¶

Histograms are used to visualize the distribution of numerical values. We use bins to group numerical variables into continuous intervals. Bins are inclusive of their lower bounds but exclusive of their upper bounds, which is often expressed as [lower, upper). We place bins on the x-axis and count the number of numerical values that lie in the bins. Note that bin widths may not always be equal, so it is always important to pay attention to the ranges on the x-axis.

The height of each bar measures the proportion of data in the bin relative to the amount of space in the bin. This is another key reason histograms are different than bar charts, because the height of each bar is relative to the width of the bin. In other words, the height of a bar refers to the density, or how crowded the bin is.

The formula for the height of a bar is

Height of bar = (Proportion of data in the bin) / (width of the bin).

Rearranging this formula gives us the formula for the area of a bar (which corresponds to the proportion of data in the bin), which is

Area of bar = (Height of bar) x (Width of bar)

Generally for histograms, to determine the number of items in a bin, it makes sense to calculate the area of the bar, which corresponds to the proportion of total values in the bin. You can then multiply this proportion by the total number of items in the distribution to get the number of items that are in the desired bin. When determining the density of each bin, we use the height of bars.

When creating histograms using the Table.hist method, the first argument we pass in is the label of the column of desired values. By default, the Table.hist method automatically determines appropriate bins for the histogram based on the numerical values passed in. If you wish to use your own bins, you can pass in an array of values using the optional bins=YOUR_BINS_ARRAY parameter.

1a.¶

Let's visualize the distribution of the ages of last season's NBA players. NBA players must be at least 19 years old to play on a team. The oldest player last season was 40 years old. Assign age_bins to an array of bin values that describe the ages of NBA players with an appropriate bin width.

Hint: To make equally sized bins, we can use the np.arange function.

# SOLUTION
# Any bin widths in the interval [1, 5] should work fine
# A slightly narrower range of bins, such as np.arange(19, 41, 2) is also valid so long as no values are excluded
age_bins = np.arange(18, 44, 2)
age_bins

array([18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42])

To visualize the distribution of player ages, create a histogram of the ages using the age_bins you just created.

# SOLUTION
nba.hist('Age', bins=age_bins)

1b.¶

Let's look at the distribution of points scored per game. Create a histogram of points scored per game using the bins defined in the cell below.

points_bins = np.arange(0, 35, 5)
# SOLUTION
nba.hist('PPG', bins=points_bins)

Based on this histogram, roughly what percentage of NBA players scored between 10 (inclusive) and 15 (exclusive) points per game?

SOLUTION: We know that the area of a bar corresponds to the proportion of data in the bin. Recall that the formula to calculate the area of a bin is to multiply the width and height of the bin. The width of the [10, 15) bin is 5. We don't have the exact height of the bar, but fortunately we only need an estimate of the percentage of players that scored between 10 and 15 points per game. We can eyeball the height of the bar and see that it is roughly 4. Therefore, the percentage of players that scored between 10 and 15 points per games is (5 x 4) = 20%.

1c.¶

Using only the histogram above, can we find the percentage of players that scored beteween 12 and 15 points per game? Why or why not?

SOLUTION: Using only the histogram above, we cannot find the percentage of players that scored between 12 and 15 points per game. This is because the bins have regular widths of 5; have no way to determine how many players are in the [12, 15) interval out of the [10, 15) bin using only the histogram above.

Functions¶

Functions are used to define a name that refers to blocks of code. Every function is composed of a function name, arugments/parameters in parentheses, and a body, which often contains a return statement.

def hello(x): """Returns a greeting for the input""" return "Hello " + x

In this function above, the function name is hello, which takes in a single parameter x, and the body returns the string "Hello x", with the x replaced by whatever the value of x is. The parameters/arguments can have any names because they serve as placeholders for values passed in by the user.

When defining functions, it's important to appropriately name your function and make sure your code is readable. Often times, you will see a docstring, which is a description of what the function does. Docstrings are written in triple quotation marks directly under the function name and above the body, as shown in the example above.

Often times, you will want to perform a set of operations on every value of a column. An easy way to do this is by defining a function that performs the set of operations on a general input, and using that function with the Table.apply method.

The syntax for this method is table_name.apply(function_name, 'column_label(s)'). Note that unlike many of the other the other Table methods, the Table.apply method does not take a column label as the first argument. The Table.apply method returns an array containing the ouputs of calling the specified function on each element of the specified column(s).

2a.¶

There are 82 games per season in the NBA. Let's find out what proportion of games each player played in during the 2016-17 season. Define a function named proportion_of_games that takes in one parameter and returns the value of the parameter divided by the number of games in a NBA season.

# SOLUTION
def proportion_of_games(x):
    return x / 82

2b.¶

Using the function we've just made, construct a table that is the same as the nba table but with an additional column labeled 'Proportion of Season Played', which describes the proportion of games that each player played in.

# SOLUTION
# First apply the function to the 'G' column, which represents the number of games played last season
# This will give us an array of the proportion of games played last season for each player
# We use the Table.with_column table to attach these values to the table as a new column 
proportions = nba.apply(proportion_of_games, 'G')
with_prop_played = nba.with_column('Proportion of Season Played', proportions)
with_prop_played

Most accurate team: GSW

Highest Average FG%: SF, Highest average PPG: SF

Which players were at every game? Assign all_games to a table of players that participated in every game of the season.

# SOLUTION
# A valid way to find the players that participated in every game of the season is to use .where('G', 82)
# but we were really looking for you to use the proportions calculated above, where a proportion of 1
# corresponds to all 82 games played out of the 82 games of the regular season.
all_games = with_prop_played.where('Proportion of Season Played', 1)
all_games

Most accurate team: GSW

Highest Average FG%: SF, Highest average PPG: SF

2c.¶

Players with common first and last names can form "combined names." For example, LeBron James and James Harden have the common name "James," and can therefore combine to form LeBron James Harden. We'll define combined names to have a prefix player and a suffix player; in the example, LeBron James is the prefix player and James Harden is the suffix player. Let's find the combined names in our nba table.

To find the common names in our table, we first need to isolate the first and last names of each player. Some players may have more than two terms in their full name, so we'll define first names to be the first term in their full name, and last names to be the remaining terms in their full name. Fill out the first_name and last_name functions below, which should take in a name as a string and return the first name or last name respectively.

Hint: For a string str that has words separated by spaces, we can retrieve the first word with the expression str.split(" ")[0], and we can retrieve the remaining words in the string using the expression " ".join(str.split(" ")[1:]). Don't worry, you're not expected to know how these expressions work.

# SOLUTION
# The parameter name is just a variable/placeholder, so we need to use the same parameter names to apply
# the methods given in the hint
def firstname(name):
    """Returns the first name of an input full name string"""
    return name.split(" ")[0]

def lastname(name):
    """Returns the last name of an input full name string"""
    return " ".join(name.split(" ")[1:])

Now that we have functions to do the heavy lifting for us, assign first_names to an array containing the first names of all the players in the nba table. Similarly, assign last_names to an array containing the last names of all the players in the nba table.

Hint: Can you think of a way to apply the functions we just made on the player names?

# SOLUTION
# The hint indicates we want to apply our functions on the 'Player' columns to get the first and last names
# We can do this using the Table.apply method
first_names = nba.apply(firstname, 'Player')
last_names = nba.apply(lastname, 'Player')

Let's organize the information we have so far. Create a names table that contains the full names of the players, the first names of the players, and the last names of the players. Make sure to label the first name column as first and the last name column as last.

# SOLUTION
# Some students may use the Table.with_column method twice, which is valid, but it is a bit easier to use the 
# Table.with_columns method when adding multiple columns, as shown in lecture
# Additionally, note that selecting columns before adding new columns is more efficient than 
# adding columns and selecting/dropping columns afterwards
names = nba.select('Player').with_columns('first', first_names,
                                         'last', last_names)
names

Most accurate team: GSW

Highest Average FG%: SF, Highest average PPG: SF

Now that we've separated each player's name into a first name and a last name, we can determine our prefix and suffix players. A prefix player is a player whose last name matches the first name of any other player in the table. Similarly, a suffix player is a player whose first name matches the last name of any other player in the table. Assign prefixes to a table containing prefix players, and assign suffixes to a table containing suffix players.

Hint: We've already made an array of all the first names and an array of all the last names in the table.

# SOLUTION
# The hint indicates that we need to check which players' first names exist in the array of last names, 
# and which players' last names exist in the array of first names.
# We can check if a value is in an array using the are.contained_in predicate when we use the Table.where method
# to keep only the rows of prefix or suffix players in our respective tables
prefixes = names.where('last', are.contained_in(first_names))
suffixes = names.where('first', are.contained_in(last_names))

Now that we know our prefix players and suffix players, we can match them up based on their common names. Ideally, we'd have you use the Table.join method to achieve this on your own, but we'll cover joins in more detail next week. For now, just run the cell below, which joins by prefix name in prefix_joined and joins by suffix name in suffix_name.

# We haven't covered joins in detail yet, just run this
prefix_joined = prefixes.join('last', suffixes, other_label='first').select(2, 1, 0, 3, 4)
suffix_joined = suffixes.join('first', prefixes, other_label='last').select(4, 3, 0, 1, 2)

# You don't need to understand the body of this function, just know how to use it
def name_joiner(full, partial):
    """
    Combines an array of full names with their corresponding partial names to return an list of combined names.

    Note: the first parameter of this function must be a full name
    combined name = full + partial
    """
    return [pre + " " + suff for pre, suff in zip(full, partial)]

prefix_joined

Most accurate team: GSW

Highest Average FG%: SF, Highest average PPG: SF

suffix_joined

Most accurate team: GSW

Highest Average FG%: SF, Highest average PPG: SF

You'll notice that the prefix_joined and suffix_joined tables contain full names and partial names of players. We want to combine the player names by the common name in the middle column. To do this, we can use the name_joiner function provided in the previous cell, which takes in an array of full names and an array of their corresponding partial names, and combines them to return a list of combined names. Note that the first parameter of the function must be a full name, which corresponds to the left common terms of a combined name.

The prefix_joined and suffix_joined tables have already been reorganized to make it easier for you to identify which columns to use with the name_joiner function. Assign pref_combined to the list of names that result from combining names in the prefix_joined table, and assign suff_combined to the list of names that result from combining names in the suffix_joined table.

# SOLUTION
# In the prefix_joined table, the full name is in the 'Player' column and the partial name to combine 
# is in the 'last_2' column
# In the suffix_joined table, the full name is in the 'Player_2' column and the partial name to combine
# is in the 'last' column
# We pass in these columns as arrays in the appropriate order into the 'name_joiner' function
pref_combined = name_joiner(prefix_joined.column('Player'), prefix_joined.column('last_2'))
suff_combined = name_joiner(suffix_joined.column('Player_2'), suffix_joined.column('last'))

All that's left to do is compile our combined names into a single table. We can sort the names in alphabetical order and remove duplicate values using the Table.sort method. Run the cell below to see the combined names!

combined_names = Table().with_column('Combined Names', pref_combined + suff_combined).sort('Combined Names', distinct=True)
combined_names # Use combined_names.show() to see all of the rows

Most accurate team: GSW

Highest Average FG%: SF, Highest average PPG: SF

Groups and Pivots¶

We use the Table.group method to aggregate rows with similar values for a column into single rows for those values. Grouping is often useful if you want to organize your data by a specific column or variable.

The syntax for the method is Table.group(label_to_group_by, collect=aggregateFunction). The first argument of the method is the column label to group by. If grouping by multiple columns, the first argument should be an array of column labels to group by. The second parameter is a function that aggregates the grouped values into a single value. The second parameter is optional, and uses the len function by default that returns a count of the number of grouped values. Note that the second parameter is only applied on values in the columns not specified in the first parameter.

Pivots are useful for cross-classifying your data by two categorical variables. Using the Table.pivot method yields a grid of aggregated values after grouping together rows with shared values. The syntax of the method is Table.pivot('col_label_one', 'col_label_two', values='AGG_VALUES_LABEL', collect=AGG_FUNCTION)

The Table.pivot('col_label_one', 'col_label_two') method requires 2 parameters. The first parameter will form the column labels of the resulting pivot table, and the second parameter will form the row labels of the table. By default, the method fills in the counts of values that correspond to the first and second category. It may be helpful to imagine the table as a grid where every value's x-coordinate is a category from the first parameter, and every value's y-coordinate is a category from the second parameter.

If you wish to aggregate custom values using a different function, you can use the two optional arguments values='AGG_VALUES_LABEL', collect=AGG_FUNCTION. When using pivot, you must include either both of these optional arguments, or neither of them. The values argument should be the label of the column you'd like to supply as values instead of counts. The collect function aggregates the grouped values into a single value, similar to the collect function in the Table.group method.

3a.¶

Create a table avg_team_age that contains teams and the average age of each team's players.

Hint: It may be easier to work with just the 'Tm' and 'Age' columns instead of the entire table.

# SOLUTION
# As per the hint, we only need the 'Tm' and 'Age' columns, so our computation may be faster if we exclude
# unneccessary columns before grouping
# We can then use the Table.group method to aggregate the values in the table by team. We pass in the optional
# `collect` argument as np.average to combine all of the ages per team using the np.average function. This gives us
# the average age per team. If we wanted to find the total age per team, we could pass in the `np.sum` or `sum`
# as the collect function.
avg_team_age = nba.select('Tm', 'Age').group('Tm', collect=np.average)
avg_team_age

Most accurate team: GSW

Highest Average FG%: SF, Highest average PPG: SF

Which team had the youngest players on average? Which team had the oldest players on average?

SOLUTION: We can sort the avg_team_age table to answer these questions. The Portland Trailblazers had the youngest players on average. The Cleveland Cavaliers had the oldest players on average.

3b.¶

Create a histogram that compares average field goal percentage by team. Set the most_accurate variable to the abbreviation of the most accurate team.

# SOLUTION
# Similar to the previous problem, we group the table by the 'Tm' column and aggregate using the `np.average` function
# Note that because we did not select or drop any specific columns, all columns of the table with numerical values
# will be aggregated into an average by team
# For instance, if you look at the `team_avg` table, you'll notice that column's like 'Player' will be blank because
# we can't take an average of strings or names. When the `collect` function does not work on a column's type, the
# aggregated value for that column will be empty
# Once we've grouped by Team, we can sort by 'FG%' to get the most accurate teams, then use the .row and .item methods
# to get the team abbreviation of the team with the highest field goal percentage
team_avg = nba.group('Tm', np.average)
most_accurate = team_avg.sort('FG% average', descending=True).row(0).item('Tm')
print('Most accurate team:', most_accurate)
team_avg.barh('Tm', 'FG% average')

Most accurate team: GSW

Let's further examine the most_accurate team. Which position on the most_accurate team had the highest average field goal percentage? Which position on the most_accurate team had the highest average points per game? Are these the same positions?

# SOLUTION
# We first find the players that were on the most accurate team (in the `most_accurate` table) and
# group them by position using the `np.average` function to aggregate the values into averages for each column
# We can then sort by our desired column and use the .row and .item methods to access the positions that 
# had the highest average field goal percentage and the highest average points per game on the most 
# accurate team
most_accurate_positions = nba.where('Tm', most_accurate).group('Pos', np.average)
highest_avg_fg = most_accurate_positions.sort('FG% average', descending=True).row(0).item('Pos')
highest_avg_ppg = most_accurate_positions.sort('PPG average', descending=True).row(0).item('Pos')
print("Highest Average FG%:", highest_avg_fg + ",", "Highest average PPG:", highest_avg_ppg)

Highest Average FG%: SF, Highest average PPG: SF

SOLUTION: The most accurate team was the Golden State Warriors. The Small Forward position had the highest average field goal percentage on this team. The Small Forward position also had the highest average points scored per game.

3c.¶

Create a table that describes how many players of each position are on each team. Your first column should be the team's abbreviation, and the remaining columns should be number of players for the different positions on the team. Be sure to first exclude the TOT team, so that any players that were on multiple teams during the season are left out of the table.

# SOLUTION
# Notice that we want to group by two categorical variables: team and position
# We can use the `Table.pivot` to do this
# First we exclude the players that were listed as being on the "TOT" team
# We then use the `Table.pivot` method with the positions column as the first argument and the team abbreviation
# column as the second argument, as specified in the problem. We don't need to pass in the optional `values` and 
# `collect` parameters because we only want the counts of the players
nba.where('Tm', are.not_equal_to('TOT')).pivot('Pos', 'Tm')

Now instead of the counts, find the average points per game of each position for every team. The format of your table should be the same as the table above. Remember to first exclude the TOT team.

# SOLUTION
# This is very similar to the previous problem, except instead of counts, we specifically want aggregates of 
# the points per game values. We can do this by supplying the points per game column in the `values` optional 
# parameter and the `np.average` function for the `collect` parameter to aggregate the values into an average.
nba.where('Tm', are.not_equal_to('TOT')).pivot('Pos', 'Tm', values='PPG', collect=np.average)

Rk	Player	Pos	Age	Tm	G	GS	MP	FG	FGA	FG%	3P	3PA	3P%	2P	2PA	2P%	FT	FTA	FT%	TRB	AST	STL	BLK	TOV	PF	PPG
1	Alex Abrines	SG	23	OKC	68	6	15.5	2	5	0.393	1.4	3.6	0.381	0.6	1.4	0.426	0.6	0.7	0.898	1.3	0.6	0.5	0.1	0.5	1.7	6
2	Quincy Acy	PF	26	TOT	38	1	14.7	1.8	4.5	0.412	1	2.4	0.411	0.9	2.1	0.413	1.2	1.6	0.75	3	0.5	0.4	0.4	0.6	1.8	5.8
3	Steven Adams	C	23	OKC	80	80	29.9	4.7	8.2	0.571	0	0	0	4.7	8.2	0.572	2	3.2	0.611	7.7	1.1	1.1	1	1.8	2.4	11.3
4	Arron Afflalo	SG	31	SAC	61	45	25.9	3	6.9	0.44	1	2.5	0.411	2	4.4	0.457	1.4	1.5	0.892	2	1.3	0.3	0.1	0.7	1.7	8.4
5	Alexis Ajinca	C	28	NOP	39	15	15	2.3	4.6	0.5	0	0.1	0	2.3	4.5	0.511	0.7	1	0.725	4.5	0.3	0.5	0.6	0.8	2	5.3
6	Cole Aldrich	C	28	MIN	62	0	8.6	0.7	1.4	0.523	0	0	nan	0.7	1.4	0.523	0.2	0.4	0.682	2.5	0.4	0.4	0.4	0.3	1.4	1.7
7	LaMarcus Aldridge	PF	31	SAS	72	72	32.4	6.9	14.6	0.477	0.3	0.8	0.411	6.6	13.8	0.48	3.1	3.8	0.812	7.3	1.9	0.6	1.2	1.4	2.2	17.3
8	Lavoy Allen	PF	27	IND	61	5	14.3	1.3	2.8	0.458	0	0	0	1.3	2.7	0.461	0.4	0.5	0.697	3.6	0.9	0.3	0.4	0.5	1.3	2.9
9	Tony Allen	SG	35	MEM	71	66	27	3.9	8.4	0.461	0.2	0.8	0.278	3.6	7.6	0.479	1.1	1.8	0.615	5.5	1.4	1.6	0.4	1.4	2.5	9.1
10	Al-Farouq Aminu	SF	26	POR	61	25	29.1	3	7.6	0.393	1.1	3.5	0.33	1.9	4.2	0.445	1.6	2.2	0.706	7.4	1.6	1	0.7	1.5	1.7	8.7

Rk	Player	Pos	Age	Tm	G	GS	MP	FG	FGA	FG%	3P	3PA	3P%	2P	2PA	2P%	FT	FTA	FT%	TRB	AST	STL	BLK	TOV	PF	PPG	Proportion of Season Played
1	Alex Abrines	SG	23	OKC	68	6	15.5	2	5	0.393	1.4	3.6	0.381	0.6	1.4	0.426	0.6	0.7	0.898	1.3	0.6	0.5	0.1	0.5	1.7	6	0.829268
2	Quincy Acy	PF	26	TOT	38	1	14.7	1.8	4.5	0.412	1	2.4	0.411	0.9	2.1	0.413	1.2	1.6	0.75	3	0.5	0.4	0.4	0.6	1.8	5.8	0.463415
3	Steven Adams	C	23	OKC	80	80	29.9	4.7	8.2	0.571	0	0	0	4.7	8.2	0.572	2	3.2	0.611	7.7	1.1	1.1	1	1.8	2.4	11.3	0.97561
4	Arron Afflalo	SG	31	SAC	61	45	25.9	3	6.9	0.44	1	2.5	0.411	2	4.4	0.457	1.4	1.5	0.892	2	1.3	0.3	0.1	0.7	1.7	8.4	0.743902
5	Alexis Ajinca	C	28	NOP	39	15	15	2.3	4.6	0.5	0	0.1	0	2.3	4.5	0.511	0.7	1	0.725	4.5	0.3	0.5	0.6	0.8	2	5.3	0.47561
6	Cole Aldrich	C	28	MIN	62	0	8.6	0.7	1.4	0.523	0	0	nan	0.7	1.4	0.523	0.2	0.4	0.682	2.5	0.4	0.4	0.4	0.3	1.4	1.7	0.756098
7	LaMarcus Aldridge	PF	31	SAS	72	72	32.4	6.9	14.6	0.477	0.3	0.8	0.411	6.6	13.8	0.48	3.1	3.8	0.812	7.3	1.9	0.6	1.2	1.4	2.2	17.3	0.878049
8	Lavoy Allen	PF	27	IND	61	5	14.3	1.3	2.8	0.458	0	0	0	1.3	2.7	0.461	0.4	0.5	0.697	3.6	0.9	0.3	0.4	0.5	1.3	2.9	0.743902
9	Tony Allen	SG	35	MEM	71	66	27	3.9	8.4	0.461	0.2	0.8	0.278	3.6	7.6	0.479	1.1	1.8	0.615	5.5	1.4	1.6	0.4	1.4	2.5	9.1	0.865854
10	Al-Farouq Aminu	SF	26	POR	61	25	29.1	3	7.6	0.393	1.1	3.5	0.33	1.9	4.2	0.445	1.6	2.2	0.706	7.4	1.6	1	0.7	1.5	1.7	8.7	0.743902

Rk	Player	Pos	Age	Tm	G	GS	MP	FG	FGA	FG%	3P	3PA	3P%	2P	2PA	2P%	FT	FTA	FT%	TRB	AST	STL	BLK	TOV	PF	PPG	Proportion of Season Played
56	Corey Brewer	SF	30	TOT	82	11	15.6	1.8	4.2	0.422	0.3	1.4	0.229	1.5	2.8	0.522	0.6	0.8	0.735	2	1.2	0.7	0.2	0.7	1.6	4.5	1
79	Marquese Chriss	PF	19	PHO	82	75	21.3	3.5	7.7	0.449	0.9	2.7	0.321	2.6	5	0.52	1.4	2.2	0.624	4.2	0.7	0.8	0.9	1.3	3.2	9.2	1
83	Jordan Clarkson	SG	24	LAL	82	19	29.2	5.8	13.1	0.445	1.4	4.3	0.329	4.4	8.7	0.503	1.6	2	0.798	3	2.6	1.1	0.1	2	1.8	14.7	1
93	Jamal Crawford	SG	36	LAC	82	1	26.3	4.4	10.6	0.413	1.4	3.9	0.36	3	6.7	0.443	2.1	2.5	0.857	1.6	2.6	0.7	0.2	1.6	1.4	12.3	1
111	Gorgui Dieng	PF	27	MIN	82	82	32.4	4	8.1	0.502	0.2	0.5	0.372	3.9	7.5	0.511	1.7	2	0.814	7.9	1.9	1.1	1.2	1.3	3.1	10	1
159	Marcin Gortat	C	32	WAS	82	82	31.2	4.8	8.2	0.579	0	0	0	4.8	8.2	0.58	1.3	1.9	0.648	10.4	1.5	0.5	0.7	1.5	2.6	10.8	1
180	Tobias Harris	PF	24	DET	82	48	31.3	6.2	13	0.481	1.3	3.8	0.347	4.9	9.1	0.537	2.3	2.8	0.841	5.1	1.7	0.7	0.5	1.2	1.6	16.1	1
192	Buddy Hield	SG	23	TOT	82	55	23	4	9.4	0.426	1.8	4.6	0.391	2.2	4.7	0.461	0.8	0.9	0.842	3.3	1.5	0.5	0.1	1.2	1.4	10.6	1
199	Justin Holiday	SG	27	NYK	82	4	20	2.8	6.6	0.433	1.2	3.3	0.355	1.7	3.2	0.513	0.8	1	0.825	2.7	1.2	0.8	0.4	0.8	1.3	7.7	1
212	Ersan Ilyasova	PF	29	TOT	82	52	26.1	4.7	10.9	0.431	1.7	4.9	0.353	3	6	0.496	2	2.5	0.778	5.9	1.7	0.7	0.3	1.4	2.6	13.1	1

Player	first	last
Alex Abrines	Alex	Abrines
Quincy Acy	Quincy	Acy
Steven Adams	Steven	Adams
Arron Afflalo	Arron	Afflalo
Alexis Ajinca	Alexis	Ajinca
Cole Aldrich	Cole	Aldrich
LaMarcus Aldridge	LaMarcus	Aldridge
Lavoy Allen	Lavoy	Allen
Tony Allen	Tony	Allen
Al-Farouq Aminu	Al-Farouq	Aminu

first	Player	last	Player_2	last_2
Lavoy	Lavoy Allen	Allen	Allen Crabbe	Crabbe
Tony	Tony Allen	Allen	Allen Crabbe	Crabbe
Alan	Alan Anderson	Anderson	Anderson Varejao	Varejao
Justin	Justin Anderson	Anderson	Anderson Varejao	Varejao
Kyle	Kyle Anderson	Anderson	Anderson Varejao	Varejao
Ryan	Ryan Anderson	Anderson	Anderson Varejao	Varejao
Carmelo	Carmelo Anthony	Anthony	Anthony Bennett	Bennett
Joel	Joel Anthony	Anthony	Anthony Bennett	Bennett
Avery	Avery Bradley	Bradley	Bradley Beal	Beal
Tyson	Tyson Chandler	Chandler	Chandler Parsons	Parsons

Tm	Age average
ATL	27
BOS	25.2667
BRK	26.6429
CHI	25.5385
CHO	25.5
CLE	29.9286
DAL	27.0625
DEN	25.5333
DET	25.4667
GSW	27.6

Tm	C	PF_2	PG	SF	SG
ATL	2	4	2	4	3
BOS	3	3	3	3	3
BRK	2	3	4	2	3
CHI	2	2	3	2	4
CHO	2	3	3	1	5
CLE	4	1	2	4	3
DAL	3	5	5	1	2
DEN	2	3	2	4	4
DET	3	3	3	3	3
GSW	6	2	2	2	3

Tm	C	PF_2	PG	SF	SG
ATL	9.85	6.7	11.65	6.65	6.1
BOS	8.83333	3.93333	12.1333	8.7	9.73333
BRK	13.7	6.7	7.875	8.45	8.83333
CHI	7.6	8.7	6.76667	14.7	7
CHO	6.35	8.06667	10.9667	9.2	7.52
CLE	5.075	19	14.6	10.975	6.83333
DAL	3.93333	8.42	8.2	2.8	7.75
DEN	9.1	6.96667	10.1	9.025	10.575
DET	8	9.83333	9.9	7.63333	5.83333
GSW	3.75	6.5	15.2	16.35	11.0333

Combined Names
Aaron Gordon Hayward
Aaron Harrison Barnes
Alan Anderson Varejao
Andrew Harrison Barnes
Anthony Davis Bertans
Avery Bradley Beal
C.J. Miles Plumlee
Carmelo Anthony Bennett
Carmelo Anthony Brown
Carmelo Anthony Davis

Tm	C	PF_2	PG	SF	SG
ATL	2	4	2	4	3
BOS	3	3	3	3	3
BRK	2	3	4	2	3
CHI	2	2	3	2	4
CHO	2	3	3	1	5
CLE	4	1	2	4	3
DAL	3	5	5	1	2
DEN	2	3	2	4	4
DET	3	3	3	3	3
GSW	6	2	2	2	3

Tm	C	PF_2	PG	SF	SG
ATL	2	4	2	4	3
BOS	3	3	3	3	3
BRK	2	3	4	2	3
CHI	2	2	3	2	4
CHO	2	3	3	1	5
CLE	4	1	2	4	3
DAL	3	5	5	1	2
DEN	2	3	2	4	4
DET	3	3	3	3	3
GSW	6	2	2	2	3