python - How to add a repeated column using pandas


I am doing homework and have run into a problem. I have a large matrix whose first column, y002, is a nominal variable with 3 levels, encoded as 1, 2, and 3 respectively. The other two columns, v96 and v97, are numeric.

Now I want to get the group means corresponding to the variable y002. I wrote this code:

group = data2.groupby(by=["y002"]).mean()

Then I index each group mean using

group1 = group["v96"]

group2 = group["v97"] 

Now I want to append the group means as a new column in the original dataframe, so that each mean matches the corresponding y002 code (1, 2, or 3). I tried this code, but it shows NaN:

data2["group1"] = pd.Series(group1, index=data2.index)

I hope someone can help me with this, many thanks :)

PS: I hope this makes sense. In R, I can do the same thing using

data2$group1 = with(data2, tapply(v97,y002,mean))[data2$y002]

But how can I implement this in Python with pandas?

You can use .transform():

import pandas as pd
import numpy as np

# data
# ============================
np.random.seed(0)
df = pd.DataFrame({'y002': np.random.randint(1,4,100), 'v96': np.random.randn(100), 'v97': np.random.randn(100)})
print(df)

        v96     v97  y002
0  -0.6866 -0.1478     1
1   0.0149  1.6838     2
2  -0.3757  0.9718     1
3  -0.0382  1.6077     2
4   0.3680 -0.2571     2
5  -0.0447  1.8098     3
6  -0.3024  0.8923     1
7  -2.2244 -0.0966     3
8   0.7240 -0.3772     1
9   0.3590 -0.5053     1
..     ...     ...   ...
90 -0.6906  1.5567     2
91 -0.6815 -0.4189     3
92 -1.5122 -0.4097     1
93  2.1969  1.1164     2
94  1.0412 -0.2510     3
95 -0.0332 -0.4152     1
96  0.0656 -0.6391     3
97  0.2658  2.4978     1
98  1.1518 -3.0051     2
99  0.1380 -0.8740     3

# processing
# ===========================
# transform returns one value per original row, so the result lines up with df
df['v96_mean'] = df.groupby('y002')['v96'].transform(np.mean)
df['v97_mean'] = df.groupby('y002')['v97'].transform(np.mean)
df

        v96     v97  y002  v96_mean  v97_mean
0  -0.6866 -0.1478     1   -0.1944    0.0837
1   0.0149  1.6838     2    0.0497   -0.0496
2  -0.3757  0.9718     1   -0.1944    0.0837
3  -0.0382  1.6077     2    0.0497   -0.0496
4   0.3680 -0.2571     2    0.0497   -0.0496
5  -0.0447  1.8098     3    0.0053   -0.0707
6  -0.3024  0.8923     1   -0.1944    0.0837
7  -2.2244 -0.0966     3    0.0053   -0.0707
8   0.7240 -0.3772     1   -0.1944    0.0837
9   0.3590 -0.5053     1   -0.1944    0.0837
..     ...     ...   ...       ...       ...
90 -0.6906  1.5567     2    0.0497   -0.0496
91 -0.6815 -0.4189     3    0.0053   -0.0707
92 -1.5122 -0.4097     1   -0.1944    0.0837
93  2.1969  1.1164     2    0.0497   -0.0496
94  1.0412 -0.2510     3    0.0053   -0.0707
95 -0.0332 -0.4152     1   -0.1944    0.0837
96  0.0656 -0.6391     3    0.0053   -0.0707
97  0.2658  2.4978     1   -0.1944    0.0837
98  1.1518 -3.0051     2    0.0497   -0.0496
99  0.1380 -0.8740     3    0.0053   -0.0707

[100 rows x 5 columns]
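As for why the attempt in the question shows NaN: passing a Series together with index= reindexes it by label, and the group means are labelled by the y002 codes (1, 2, 3) rather than by the row labels of data2. The sketch below illustrates this on a small made-up frame (the six rows are hypothetical, only the column names come from the question) and shows Series.map as a lookup that plays the role of the R indexing `[data2$y002]`.

```python
import pandas as pd

# Hypothetical toy data; only the column names are taken from the question.
data2 = pd.DataFrame({
    "y002": [1, 2, 1, 3, 2, 3],
    "v96": [1.0, 2.0, 3.0, 4.0, 6.0, 8.0],
    "v97": [10.0, 20.0, 30.0, 40.0, 60.0, 80.0],
})

group = data2.groupby("y002").mean()   # one row per y002 code: 1, 2, 3
group1 = group["v96"]

# Reindexing the group means onto row labels 0..5 matches only the labels
# 1, 2 and 3; the other rows become NaN, and the matches are by row label,
# not by the row's y002 value.
misaligned = pd.Series(group1, index=data2.index)

# Look up each row's y002 code in the group means instead.
data2["group1"] = data2["y002"].map(group1)

# transform computes the same column without the intermediate table.
via_transform = data2.groupby("y002")["v96"].transform("mean")

print(misaligned)
print(data2)
```

Both map and transform give one value per original row. map is handy when the group table already exists, while transform skips that intermediate step.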
