Conditional value replacement in pandas (where & mask)

三叶草body 2023-03-28

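In day-to-day analysis we often need to filter and transform data. loc and iloc can select the rows or columns to work on; this post introduces a more advanced approach to conditional replacement, using where and mask.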

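pd.where (that is, Series.where / DataFrame.where): replaces the values where the condition is False

pd.mask (that is, Series.mask / DataFrame.mask): replaces the values where the condition is True

np.where: chooses between two values by condition, much like a ternary expression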

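# Where the condition is False, replace the value with other
# (the errors= and try_cast= keywords of older releases were removed in pandas 2.0)
DataFrame.where(cond, other=nan, inplace=False, axis=None, level=None)
# Where the condition is True, replace the value with other
DataFrame.mask(cond, other=nan, inplace=False, axis=None, level=None)
# Where the condition is True, the value is x; otherwise it is y
np.where(condition, x, y)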
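
As a quick illustration of the three calls, here is a minimal sketch on a small hypothetical Series. The name s, its values, and the replacement value -1 are assumptions for illustration, not part of the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only for illustration
s = pd.Series([10, 70, 40, 90])
cond = s > 60

# where: keep values where cond is True, replace the rest with other
kept = s.where(cond, -1)

# mask: replace values where cond is True, keep the rest
masked = s.mask(cond, -1)

# np.where: pick x where cond is True, otherwise y; returns a NumPy array
picked = np.where(cond, 'high', 'low')

print(kept.tolist())    # [-1, 70, -1, 90]
print(masked.tolist())  # [10, -1, 40, -1]
print(picked.tolist())  # ['low', 'high', 'low', 'high']
```

Note that where and mask start from the existing values and overwrite some of them, while np.where builds a new array from the two alternatives.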

First, simulate a table of student scores:

import pandas as pd
import numpy as np

# Define the subjects
subjects = ['math', 'chinese', 'english', 'history']

# Define the students
students = ['Tom', 'Alice', 'Bobby', 'Candy', 'David', 'Eva', 'Frank', 'Grace', 'Howard', 'Ivy',
            'John', 'Karen', 'Larry', 'Marie', 'Nancy', 'Oscar', 'Peter', 'Queen', 'Robert', 'Susan']

# Randomly generate the scores (integers from 0 to 99; high is exclusive)
score = np.random.randint(low=0, high=100, size=(len(students), len(subjects)))

# Build the DataFrame
df = pd.DataFrame(
    score,
    columns=subjects,
    index=students
)

df

	math	chinese	english	history
Tom	24	57	60	44
Alice	92	25	64	26
Bobby	96	61	94	96
Candy	36	87	10	38
David	29	73	37	64
Eva	94	40	30	81
Frank	24	44	40	14
Grace	37	70	50	5
Howard	82	86	46	10
Ivy	24	7	30	30
John	39	32	97	48
Karen	68	29	34	11
Larry	82	5	3	78
Marie	96	83	73	63
Nancy	25	33	37	53
Oscar	2	65	49	73
Peter	9	19	11	67
Queen	44	19	85	23
Robert	75	35	47	77
Susan	71	6	10	82

1 pd.where

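where(condition, other)

Value replacement: the pandas where method keeps the original value where the condition is True and replaces it with other everywhere else.

Add a math_pass column: a math score above 60 is a pass ('及格'); anything else is a fail ('不及格').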

df1 = df.copy()
# Default every row to '及格' (pass)
df1['math_pass'] = '及格'
df1['math_pass'] = df1['math_pass'].where(df1['math'] > 60, '不及格')

df1

	math	chinese	english	history	math_pass
Tom	24	57	60	44	不及格
Alice	92	25	64	26	及格
Bobby	96	61	94	96	及格
Candy	36	87	10	38	不及格
David	29	73	37	64	不及格
Eva	94	40	30	81	及格
Frank	24	44	40	14	不及格
Grace	37	70	50	5	不及格
Howard	82	86	46	10	及格
Ivy	24	7	30	30	不及格
John	39	32	97	48	不及格
Karen	68	29	34	11	及格
Larry	82	5	3	78	及格
Marie	96	83	73	63	及格
Nancy	25	33	37	53	不及格
Oscar	2	65	49	73	不及格
Peter	9	19	11	67	不及格
Queen	44	19	85	23	不及格
Robert	75	35	47	77	及格
Susan	71	6	10	82	及格
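
When other is omitted, where fills the non-matching positions with NaN, and it can be applied to a whole DataFrame at once. A minimal sketch, assuming a small hypothetical frame named scores (not the df generated earlier, whose values are random):

```python
import pandas as pd

# Hypothetical toy frame, used only for illustration
scores = pd.DataFrame(
    {'math': [24, 92], 'chinese': [57, 25]},
    index=['Tom', 'Alice'],
)

# Keep scores above 60; everything else becomes NaN because other is omitted
passed = scores.where(scores > 60)

print(passed)
print(passed['math'].tolist())          # [nan, 92.0]
print(passed['chinese'].isna().all())   # True
```

Because NaN is a floating-point value, the integer columns are upcast to float64 in the result.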

2 np.where

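NumPy's where works differently from the pandas method: it is a standalone function that takes the condition and both result values.

# Where the condition is True, the value is x; otherwise it is y
np.where(condition, x, y)

Add a math_pass2 column: a math score above 60 is a pass ('及格'); anything else is a fail ('不及格').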

df2 = df.copy()
# A math score above 60 is '及格' (pass); otherwise '不及格' (fail)
df2['math_pass2'] = np.where(df2['math'] > 60, '及格', '不及格')

df2

	math	chinese	english	history	math_pass2
Tom	24	57	60	44	不及格
Alice	92	25	64	26	及格
Bobby	96	61	94	96	及格
Candy	36	87	10	38	不及格
David	29	73	37	64	不及格
Eva	94	40	30	81	及格
Frank	24	44	40	14	不及格
Grace	37	70	50	5	不及格
Howard	82	86	46	10	及格
Ivy	24	7	30	30	不及格
John	39	32	97	48	不及格
Karen	68	29	34	11	及格
Larry	82	5	3	78	及格
Marie	96	83	73	63	及格
Nancy	25	33	37	53	不及格
Oscar	2	65	49	73	不及格
Peter	9	19	11	67	不及格
Queen	44	19	85	23	不及格
Robert	75	35	47	77	及格
Susan	71	6	10	82	及格
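
Since np.where behaves like a ternary expression, calls can be nested to express more than two outcomes. The sketch below assumes a hypothetical three-row Series and made-up cut-offs of 60 and 90; it also shows that the result is a plain NumPy array rather than a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only for illustration
math_scores = pd.Series([24, 92, 68], index=['Tom', 'Alice', 'Karen'])

# Nested np.where acts like a chained ternary expression.
# The 60 and 90 cut-offs are assumptions chosen for this example.
grade = np.where(math_scores > 90, 'excellent',
                 np.where(math_scores > 60, 'pass', 'fail'))

print(type(grade).__name__)  # ndarray
print(grade.tolist())        # ['fail', 'excellent', 'pass']

# Wrap the array in a Series to restore the original index
grade_series = pd.Series(grade, index=math_scores.index)
print(grade_series['Karen'])  # pass
```

Assigning the array to a DataFrame column, as in the example above, works because the values line up with the rows by position.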

3 pd.mask

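Value replacement: the pandas mask method replaces the value with other where the condition is True and keeps the original value everywhere else.

Add a math_pass3 column: a math score above 60 is a pass ('及格'); anything else is a fail ('不及格').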

df3 = df.copy()
df3['math_pass3'] = '不及格'
df3['math_pass3'] = df3['math_pass3'].mask(df3['math'] > 60, '及格')

df3

	math	chinese	english	history	math_pass3
Tom	24	57	60	44	不及格
Alice	92	25	64	26	及格
Bobby	96	61	94	96	及格
Candy	36	87	10	38	不及格
David	29	73	37	64	不及格
Eva	94	40	30	81	及格
Frank	24	44	40	14	不及格
Grace	37	70	50	5	不及格
Howard	82	86	46	10	及格
Ivy	24	7	30	30	不及格
John	39	32	97	48	不及格
Karen	68	29	34	11	及格
Larry	82	5	3	78	及格
Marie	96	83	73	63	及格
Nancy	25	33	37	53	不及格
Oscar	2	65	49	73	不及格
Peter	9	19	11	67	不及格
Queen	44	19	85	23	不及格
Robert	75	35	47	77	及格
Susan	71	6	10	82	及格
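
mask is the mirror image of where: masking on a condition gives the same result as calling where on the negated condition. A minimal sketch, assuming a hypothetical three-row Series and a replacement value of 0:

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
math_scores = pd.Series([24, 92, 68], index=['Tom', 'Alice', 'Karen'])
cond = math_scores > 60

# mask replaces the values where cond is True ...
via_mask = math_scores.mask(cond, 0)
# ... which matches where applied to the negated condition
via_where = math_scores.where(~cond, 0)

print(via_mask.tolist())           # [24, 0, 0]
print(via_mask.equals(via_where))  # True
```

Which of the two to use is mostly a matter of readability: pick the one whose condition reads naturally without negation.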
