pandas-datareader

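python - How do I round the datetime index of a Pandas DataFrame?

I have a pandas DataFrame like this:

    index
    2018-06-01 02:50:00  R  45.48  -2.8
    2018-06-01 07:13:00  R  45.85  -2.0
    ...
    2018-06-01 08:37:00  R  45.87  -2.7

I want to round the index to the hour, like this:

    index
    2018-06-01 02:00:00  R  45.48  -2.8
    2018-06-01 07:00:00  R  45.85  -2.0
    ...
    2018-06-01 08:00:00  R  45.87  -2.7

I am trying the following code: df = df.date_time.apply(lambda x: x.round('H')) but it returns a Series instead of a DataFrame with a modified index column.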
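The desired output in the question maps 02:50 to 02:00, which is flooring rather than rounding. A minimal sketch of both, assuming the timestamps are already the DataFrame's index; the column names `flag`, `a` and `b` are hypothetical, since the excerpt does not show the real headers:

```python
import pandas as pd

# Sample rows taken from the question; column names are made up for illustration.
idx = pd.to_datetime([
    "2018-06-01 02:50:00",
    "2018-06-01 07:13:00",
    "2018-06-01 08:37:00",
])
df = pd.DataFrame(
    {"flag": ["R", "R", "R"], "a": [45.48, 45.85, 45.87], "b": [-2.8, -2.0, -2.7]},
    index=idx,
)

# Assigning to df.index keeps the DataFrame intact and only changes the index.
floored = df.copy()
floored.index = df.index.floor("h")  # 02:50 -> 02:00 (matches the desired output)

# round() goes to the nearest hour instead, so 02:50 becomes 03:00.
rounded_index = df.index.round("h")

print(floored)
print(rounded_index)
```

The lowercase `"h"` alias is used here because recent pandas releases deprecate the uppercase `'H'` spelling from the question.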

python Pandas 2018 01 code datetime dataframe

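python - Pandas + groupby

The dataset contains 4 columns, where name is the child's name, yearofbirth is the year in which the child was born, and number is the number of babies given that particular name. For example, entry 1 reads: in the year 1880, 7065 girl children were named Mary. Using pandas, I am trying to find out which name was the most used in each year. My code: df.groupby(['yearofbirth']).agg({'number':'max'}).reset_index() This partially answers the question at hand, but I want the name as well as the maximum number. Best answer: Based on this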
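One common way to keep the name next to the maximum is to look up the row label of each group's maximum with `idxmax` and select those rows. This is a sketch, not necessarily the approach of the truncated answer above. Only the 1880 Mary row comes from the question; the other rows are invented sample values, and the fourth column is left out because the excerpt does not name it.

```python
import pandas as pd

# Only the first row is from the question; the rest is made-up sample data.
df = pd.DataFrame({
    "name": ["Mary", "Anna", "Mary", "Anna"],
    "yearofbirth": [1880, 1880, 1881, 1881],
    "number": [7065, 2604, 6919, 2698],
})

# idxmax returns, per year, the index label of the row with the largest number;
# df.loc then pulls the complete rows, so the name column is preserved.
top_rows = df.loc[df.groupby("yearofbirth")["number"].idxmax()]
top_per_year = top_rows.reset_index(drop=True)

print(top_per_year)
```

If several names tie for the maximum in a year, `idxmax` keeps only the first one it encounters.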

python Pandas section yearofbirth pandas-groupby data-analysis

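python - Comparing two columns of a pandas DataFrame with np.char.find gives TypeError: string operation on non-string array

I want to compare two Series of strings to see, element by element, whether one contains the other. I first tried using apply, but it is slow:

    cols = ['s1', 's2']
    list_of_series = [pd.Series(['one', 'sdf'], index=cols), pd.Series(['two', 'xytwo'], index=cols)]
    df = pd.DataFrame(list_of_series, columns=cols)
    df
        s1     s2
    0  one    sdf
    1  two  xytwo
    df.apply(lambda row: row['s1'] in row['s2'], axis=1)
    0    False
    1     True
    dtype: bool

It seems to work with the following code: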
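The error in the title is what the `np.char` functions raise when they receive object-dtype arrays, which is how pandas has traditionally stored strings. A minimal sketch of one workaround, assuming every value really is a string: cast both columns to a NumPy string dtype first. The column name `s1_in_s2` is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"s1": ["one", "two"], "s2": ["sdf", "xytwo"]})

# Cast to a fixed-width NumPy unicode dtype so np.char accepts the arrays.
s1 = df["s1"].to_numpy().astype(str)
s2 = df["s2"].to_numpy().astype(str)

# np.char.find returns the position of the substring, or -1 when it is absent.
contains = np.char.find(s2, s1) >= 0
df["s1_in_s2"] = contains

# A plain comprehension gives the same answer without any dtype handling.
zipped = [a in b for a, b in zip(df["s1"], df["s2"])]

print(df)
```

Which of the two is faster depends on the data, so it is worth timing both on a realistic sample before dropping `apply`.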

string non-string code section python pandas numpy

python - Pandas to Excel (merged header column)

I want to convert my df to an Excel sheet, but I also want to add a header column that categorizes all the columns. To reproduce:

    import pandas as pd
    # Create a Pandas dataframe from some data.
    df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
    # Create a Pandas Excel writer using XlsxWriter as the engine.
    writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
    # Convert the dataframe to an XlsxWriter Exc
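One possible way to get a grouping header is to give the DataFrame two-level columns before exporting. The sketch below only builds that structure; the label `Category` is a hypothetical name, and the export step is left out because it needs an Excel engine such as XlsxWriter to be installed.

```python
import pandas as pd

df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})

# Wrapping the frame in a dict and concatenating along the columns adds an
# outer column level; 'Category' is a made-up label for that outer header.
grouped = pd.concat({'Category': df}, axis=1)

print(grouped.columns.tolist())
print(grouped.head())
```

Writing `grouped` with `to_excel` is then expected to place the outer level in its own header row above the column names, but that behaviour should be checked against the pandas version and engine in use.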

python Pandas code section excel

python - Pandas datetime to Unix timestamp in seconds

From the official documentation of pandas.to_datetime we can say:

    unit : string, default 'ns'
    unit of the arg (D, s, ms, us, ns) denote the unit, which is an integer or float number. This will be based off the origin. Example, with unit='ms' and origin='unix' (the default), this would calculate the number of milliseconds to the unix epoch start.

So when I try it like this, import pandas as pd df = pd.D
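The `unit` argument quoted above describes how `to_datetime` interprets numbers coming in, not how to turn datetimes back into numbers. A small sketch of both directions, assuming timezone-naive timestamps that are meant as UTC; the sample dates are illustrative:

```python
import pandas as pd

stamps = pd.Series(pd.to_datetime(["1970-01-01 00:00:10", "2018-06-01 00:00:00"]))

# Datetime -> Unix seconds: subtract the epoch and floor-divide by one second.
epoch = pd.Timestamp("1970-01-01")
seconds = (stamps - epoch) // pd.Timedelta("1s")

# Unix seconds -> datetime: this is where unit='s' applies.
restored = pd.to_datetime(seconds, unit="s")

print(seconds.tolist())
print(restored.tolist())
```

Dividing a timedelta by `pd.Timedelta("1s")` avoids relying on the internal resolution of the datetime column.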

python Pandas code datetime

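python - Pandas DataFrame: replace NaN from other rows when column values match

I have a DataFrame, i.e.

Input DataFrame

       class section  sub  marks school   city
    0      I       A  Eng     80  jghss  salem
    1      I       A  Mat     90  jghss  salem
    2      I       A  Eng     50    Nan  salem
    3    III       A  Eng     80  gphss    Nan
    4    III       A  Mat     45    Nan  salem
    5    III       A  Eng     40  gphss    Nan
    6    III       A  Eng     20  gphss  salem
    7    III       A  Mat     55  gphss    Nan

When the values in the "class" and "section" columns match, I need to replace the "Nan" in "school" and "city". The result should be, Input DataFrame class section sub marks school city 0 I A Eng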
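One way to fill such gaps is to take the first non-missing value within each (class, section) group and use it wherever a value is absent. The sketch below assumes the "Nan" entries in the post are genuine missing values rather than the literal string, and that each group has one consistent school and city.

```python
import numpy as np
import pandas as pd

# Data as shown in the question, with "Nan" interpreted as a missing value.
df = pd.DataFrame({
    "class": ["I", "I", "I", "III", "III", "III", "III", "III"],
    "section": ["A"] * 8,
    "sub": ["Eng", "Mat", "Eng", "Eng", "Mat", "Eng", "Eng", "Mat"],
    "marks": [80, 90, 50, 80, 45, 40, 20, 55],
    "school": ["jghss", "jghss", np.nan, "gphss", np.nan, "gphss", "gphss", "gphss"],
    "city": ["salem", "salem", "salem", np.nan, "salem", np.nan, "salem", np.nan],
})

keys = ["class", "section"]
cols = ["school", "city"]

# transform("first") broadcasts each group's first non-missing value to every
# row of the group; fillna then uses it only where the original is missing.
group_values = df.groupby(keys)[cols].transform("first")
df[cols] = df[cols].fillna(group_values)

print(df)
```

If the file actually contains the text "Nan", replacing it with `np.nan` first (for example with `df.replace("Nan", np.nan)`) would be needed before this works.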

Dataframe python salem section gphss python-3.x pandas nan

python - Pandas: generate and plot the mean

I have a Pandas DataFrame like this:

    In [61]: df = DataFrame(np.random.rand(3,4), index=['art','mcf','mesa'], columns=['pol1','pol2','pol3','pol4'])
    In [62]: df
    Out[62]:
              pol1      pol2      pol3      pol4
    art   0.661592  0.479202  0.700451  0.345085
    mcf   0.235517  0.665981  0.778774  0.610344
    mesa  0.838396  0.035648  0.424047  0.866920

I want to generate a row containing the mean of each policy across the benchmarks, and then plot it. Currently, I do this
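A short sketch of one way to get that row, using the values printed in the question instead of fresh random numbers. The row label `mean` is an arbitrary choice, and the plotting call is only described afterwards so the block runs without matplotlib.

```python
import pandas as pd

# Values copied from the Out[62] display in the question.
df = pd.DataFrame(
    [
        [0.661592, 0.479202, 0.700451, 0.345085],
        [0.235517, 0.665981, 0.778774, 0.610344],
        [0.838396, 0.035648, 0.424047, 0.866920],
    ],
    index=["art", "mcf", "mesa"],
    columns=["pol1", "pol2", "pol3", "pol4"],
)

# mean() averages down the rows by default, giving one value per policy column.
means = df.mean()

# Append the averages as an extra row without modifying the original frame.
with_mean = df.copy()
with_mean.loc["mean"] = means

print(with_mean)
```

With matplotlib installed, `means.plot(kind="bar")` should draw the averages as a bar chart.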

python Pandas code pol matplotlib plot

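python - Boolean mask in a Pandas Panel

I am having some trouble masking a Panel in the same way I would mask a DataFrame. What I want to do feels simple, but I have not found a way by looking through the docs and online forums. I have a simple example below:

    import pandas
    import numpy as np
    import datetime
    start_date = datetime.datetime(2009,3,1,6,29,59)
    r = pandas.date_range(start_date, periods=12)
    cols_1 = ['AAPL','AAPL','GOOG','GOOG','GS','GS']
    cols_2 = ['close','rate','close','rate','close'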

python Pandas 2009 nan 03 panel mask

python - Pandas error: 'DataFrame' object has no attribute 'loc'

I am new to pandas and am working through the 10 Minutes to pandas tutorial with pandas version 0.10.1. However, when I do the following, I get the error shown below. print df works fine. Why doesn't .loc work?

Code:

    import numpy as np
    import pandas as pd
    df = pd.DataFrame(np.random.randn(6,4), index=pd.date_range('20130101', periods=6), columns=['A','B','C','D'])
    df.loc[:,['A','B']]

Error:

    AttributeError Traceback (most recent call last) in (
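The `.loc` indexer was introduced in pandas 0.11.0, so an installation at 0.10.1 does not have it. A quick sketch for checking the installed version and confirming that the tutorial's selection works once pandas is upgraded:

```python
import numpy as np
import pandas as pd

# .loc is available from pandas 0.11.0 onward; print the version to check.
print(pd.__version__)

df = pd.DataFrame(
    np.random.randn(6, 4),
    index=pd.date_range('20130101', periods=6),
    columns=['A', 'B', 'C', 'D'],
)

# Select all rows and the two named columns by label.
subset = df.loc[:, ['A', 'B']]
print(subset)
```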

section code python python-2.7 numpy scipy pandas

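python - Pandas: populate a new column using if-else

I have a DataFrame like this:

    col1  col2
       1     0
       0     1
       0     0
       0     0
       3     3
       2     0
       0     4

I want to add a column that is 1 if col2 > 0, or 0 otherwise. If I were using R, I would do something like df1[,'col3'] <- ifelse(df1$col2 > 0, 1, 0). How do I do this in python/pandas?

Best answer: You can convert the boolean Series df.col2 > 0 to an integer Series (True becomes 1 and False becomes 0): df['col3'] = (df.col2 > 0).astype('int') (To create a new column, you just give it a name and assign it a Series, array, or list of the same length as your DataFrame.) This produces col3 as: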
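The answer's one-liner can be run against the data from the question as follows. The `np.where` variant is an addition, not part of the quoted answer; it is the closer analogue of R's `ifelse` when the two outcomes are something other than 1 and 0. The name `col3_where` is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [1, 0, 0, 0, 3, 2, 0],
    "col2": [0, 1, 0, 0, 3, 0, 4],
})

# Approach from the answer: cast the boolean comparison to integers.
df["col3"] = (df.col2 > 0).astype("int")

# Alternative: np.where(condition, value_if_true, value_if_false).
df["col3_where"] = np.where(df["col2"] > 0, 1, 0)

print(df)
```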

if-else python code section col pandas if-statement dataframe