我需要能够基于时间序列数据具有置信带的线性回归模型创建用于预测的python函数:
该函数需要使用一个参数来指定要预测的距离。例如1天,7天,30天,90天等。根据参数,将需要创建带有置信带的Holt-Winters预测:
我的时间序列数据如下所示:
print series
[{"target": "average", "datapoints": [[null, 1435688679], [34.870499801635745, 1435688694], [null, 1435688709], [null, 1435688724], [null, 1435688739], [null, 1435688754], [null, 1435688769], [null, 1435688784], [null, 1435688799], [null, 1435688814], [null, 1435688829], [null, 1435688844], [null, 1435688859], [null, 1435688874], [null, 1435688889], [null, 1435688904], [null, 1435688919], [null, 1435688934], [null, 1435688949], [null, 1435688964], [null, 1435688979], [38.180000209808348, 1435688994], [null, 1435689009], [null, 1435689024], [null, 1435689039], [null, 1435689054], [null, 1435689069], [null, 1435689084], [null, 1435689099], [null, 1435689114], [null, 1435689129], [null, 1435689144], [null, 1435689159], [null, 1435689174], [null, 1435689189], [null, 1435689204], [null, 1435689219], [null, 1435689234], [null, 1435689249], [null, 1435689264], [null, 1435689279], [30.79849989414215, 1435689294], [null, 1435689309], [null, 1435689324], [null, 1435689339], [null, 1435689354], [null, 1435689369], [null, 1435689384], [null, 1435689399], [null, 1435689414], [null, 1435689429], [null, 1435689444], [null, 1435689459], [null, 1435689474], [null, 1435689489], [null, 1435689504], [null, 1435689519], [null, 1435689534], [null, 1435689549], [null, 1435689564]]}]
[{"target": "average", "datapoints": [[null, 1435688679], [34.870499801635745, 1435688694], [null, 1435688709], [null, 1435688724], [null, 1435688739], [null, 1435688754], [null, 1435688769], [null, 1435688784], [null, 1435688799], [null, 1435688814], [null, 1435688829], [null, 1435688844], [null, 1435688859], [null, 1435688874], [null, 1435688889], [null, 1435688904], [null, 1435688919], [null, 1435688934], [null, 1435688949], [null, 1435688964], [null, 1435688979], [38.180000209808348, 1435688994], [null, 1435689009], [null, 1435689024], [null, 1435689039], [null, 1435689054], [null, 1435689069], [null, 1435689084], [null, 1435689099], [null, 1435689114], [null, 1435689129], [null, 1435689144], [null, 1435689159], [null, 1435689174], [null, 1435689189], [null, 1435689204], [null, 1435689219], [null, 1435689234], [null, 1435689249], [null, 1435689264], [null, 1435689279], [30.79849989414215, 1435689294], [null, 1435689309], [null, 1435689324], [null, 1435689339], [null, 1435689354], [null, 1435689369], [null, 1435689384], [null, 1435689399], [null, 1435689414], [null, 1435689429], [null, 1435689444], [null, 1435689459], [null, 1435689474], [null, 1435689489], [null, 1435689504], [null, 1435689519], [null, 1435689534], [null, 1435689549], [null, 1435689564]]},{"target": "Forecast", "datapoints": [[186.77999925613403, 1435520801], [178.95000147819519, 1435521131]]},{"target": "Upper", "datapoints": [[186.77999925613403, 1435520801], [178.95000147819519, 1435521131]]},{"target": "Lower", "datapoints": [[186.77999925613403, 1435520801], [178.95000147819519, 1435521131]]}]
最佳答案
在您的问题文本中,您明确指出您希望
回归输出以及输出的上限和下限
预言。您还提到将Holt-Winters算法用于
特别是预测。
其他答复者建议的软件包很有用,但您可能会注意到
该sklearn LinearRegression不会给您“错误范围”
盒子”,statsmodels执行not provide Holt-Winters right now。
因此,我建议尝试使用Holt-Winters的this implementation。
不幸的是,它的许可尚不清楚,因此我无法在此处复制
满的。现在,我不确定您是否真的想要Holt-Winters
(季节性)预测,或Holt的线性指数平滑
算法。我猜后者给了该职位的标题。因此,
您可以使用链接库的linear()函数。这
对于感兴趣的读者,此技术是described in detail here。
为了不提供仅链接的答案-我将描述
这里的主要功能。定义了一个函数来获取数据,即
def linear(x, fc, alpha = None, beta = None):
x是适合的数据,fc是所需的时间步数alpha或beta不是scipy.optimize.fmin_l_bfgs_b 估算 return Y[-fc:], alpha, beta, rmse
Y[-fc:]是预报点,alpha和beta是rmse是均方根误差。It is possible to calculate confidence intervals around long-term forecasts produced by exponential smoothing models, by considering them as special cases of ARIMA models. (Beware: not all software calculates confidence intervals for these models correctly.) The width of the confidence intervals depends on (i) the RMS error of the model, (ii) the type of smoothing (simple or linear); (iii) the value(s) of the smoothing constant(s); and (iv) the number of periods ahead you are forecasting. In general, the intervals spread out faster as α gets larger in the SES model and they spread out much faster when linear rather than simple smoothing is used.
holt()中的forecast包,因此需要这样的间隔。在R code,我们可以看到它返回了一个对象holt。在幕后-这最终要求forecast(ets(...) ,返回平均class1和方差mu(以及我不得不承认我不理解的var)。cj R函数,该函数估计每个函数的方差class1的累积和alpha + k*beta是时间步长的预测数。一旦有了#Copy, import or reimplement the RMSE and linear function from
#https://gist.github.com/andrequeiroz/5888967
#factor in case there are not 1 timestep per day - in your case
#assuming the timesteps are UTC epoch - I think they're 5 min
# spaced i.e. 288 per day
timesteps_per_day = 288
# Note - big assumption here - your known data will be regular in time
# i.e. timesteps_per_day observations per day. From the timestamps this seems valid.
# if you can't guarantee that - you'll need to interpolate the data
def holt_predict(data, timestamps, forecast_days, pred_error_level = 0.95):
forecast_timesteps = forecast_days*timesteps_per_day
middle_predictions, alpha, beta, rmse = linear(data,int(forecast_timesteps))
cum_error = [beta+alpha]
for k in range(1,forecast_timesteps):
cum_error.append(cum_error[k-1] + k*beta + alpha)
cum_error = np.array(cum_error)
#Use some numpy multiplication to get the intervals
var = cum_error * rmse**2
# find the correct ppf on the normal distribution (two-sided)
p = abs(scipy.stats.norm.ppf((1-pred_error_level)/2))
interval = np.sqrt(var) * p
upper = middle_predictions + interval
lower = middle_predictions - interval
fcast_timestamps = [timestamps[-1] + i * 86400 / timesteps_per_day for i in range(forecast_timesteps)]
ret_value = []
ret_value.append({'target':'Forecast','datapoints': zip(middle_predictions, fcast_timestamps)})
ret_value.append({'target':'Upper','datapoints':zip(upper,fcast_timestamps)})
ret_value.append({'target':'Lower','datapoints':zip(lower,fcast_timestamps)})
return ret_value
if __name__=='__main__':
import numpy as np
import scipy.stats
from math import sqrt
null = None
data_in = [{"target": "average", "datapoints": [[null, 1435688679],
[34.870499801635745, 1435688694], [null, 1435688709], [null,
1435688724], [null, 1435688739], [null, 1435688754], [null, 1435688769],
[null, 1435688784], [null, 1435688799], [null, 1435688814], [null,
1435688829], [null, 1435688844], [null, 1435688859], [null, 1435688874],
[null, 1435688889], [null, 1435688904], [null, 1435688919], [null,
1435688934], [null, 1435688949], [null, 1435688964], [null, 1435688979],
[38.180000209808348, 1435688994], [null, 1435689009], [null,
1435689024], [null, 1435689039], [null, 1435689054], [null, 1435689069],
[null, 1435689084], [null, 1435689099], [null, 1435689114], [null,
1435689129], [null, 1435689144], [null, 1435689159], [null, 1435689174],
[null, 1435689189], [null, 1435689204], [null, 1435689219], [null,
1435689234], [null, 1435689249], [null, 1435689264], [null, 1435689279],
[30.79849989414215, 1435689294], [null, 1435689309], [null, 1435689324],
[null, 1435689339], [null, 1435689354], [null, 1435689369], [null,
1435689384], [null, 1435689399], [null, 1435689414], [null, 1435689429],
[null, 1435689444], [null, 1435689459], [null, 1435689474], [null,
1435689489], [null, 1435689504], [null, 1435689519], [null, 1435689534],
[null, 1435689549], [null, 1435689564]]}]
#translate the data. There may be better ways if you're
#prepared to use pandas / input data is proper json
time_series = data_in[0]["datapoints"]
epoch_in = []
Y_observed = []
for (y,x) in time_series:
if y and x:
epoch_in.append(x)
Y_observed.append(y)
#Pass in the number of days to forecast
fcast_days = 30
res = holt_predict(Y_observed,epoch_in,fcast_days)
data_out = data_in + res
#data_out now holds the data as you wanted it.
#Optional plot of results
import matplotlib.pyplot as plt
plt.plot(epoch_in,Y_observed)
m,tstamps = zip(*res[0]['datapoints'])
u,tstamps = zip(*res[1]['datapoints'])
l,tstamps = zip(*res[2]['datapoints'])
plt.plot(tstamps,u, label='upper')
plt.plot(tstamps,l, label='lower')
plt.plot(tstamps,m, label='mean')
plt.show()
k类型添加到您的对象中。如果您确实需要tuple,则将list替换为zip(upper,fcast_timestamps),其中代码会将map(list,zip(upper,fcast_timestamps)),upper和lower dict添加到结果中。Forecast,只有3个正版null,它将说:
You've got to be joking. I need more data!
"confidence intervals for the mean are much narrower than prediction intervals"
关于python - 如何在python中为时间序列数据创建线性回归预测,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/31147594/
出于纯粹的兴趣,我很好奇如何按顺序创建PI,而不是在过程结果之后生成数字,而是让数字在过程本身生成时显示。如果是这种情况,那么数字可以自行产生,我可以对以前看到的数字实现垃圾收集,从而创建一个无限系列。结果只是在Pi系列之后每秒生成一个数字。这是我通过互联网筛选的结果:这是流行的计算机友好算法,类机器算法:defarccot(x,unity)xpow=unity/xn=1sign=1sum=0loopdoterm=xpow/nbreakifterm==0sum+=sign*(xpow/n)xpow/=x*xn+=2sign=-signendsumenddefcalc_pi(digits
关闭。这个问题是opinion-based.它目前不接受答案。想要改进这个问题?更新问题,以便editingthispost可以用事实和引用来回答它.关闭4年前。Improvethisquestion我想在固定时间创建一系列低音和高音调的哔哔声。例如:在150毫秒时发出高音调的蜂鸣声在151毫秒时发出低音调的蜂鸣声200毫秒时发出低音调的蜂鸣声250毫秒的高音调蜂鸣声有没有办法在Ruby或Python中做到这一点?我真的不在乎输出编码是什么(.wav、.mp3、.ogg等等),但我确实想创建一个输出文件。
如何在buildr项目中使用Ruby?我在很多不同的项目中使用过Ruby、JRuby、Java和Clojure。我目前正在使用我的标准Ruby开发一个模拟应用程序,我想尝试使用Clojure后端(我确实喜欢功能代码)以及JRubygui和测试套件。我还可以看到在未来的不同项目中使用Scala作为后端。我想我要为我的项目尝试一下buildr(http://buildr.apache.org/),但我注意到buildr似乎没有设置为在项目中使用JRuby代码本身!这看起来有点傻,因为该工具旨在统一通用的JVM语言并且是在ruby中构建的。除了将输出的jar包含在一个独特的、仅限ruby
我正在使用的第三方API的文档状态:"[O]urAPIonlyacceptspaddedBase64encodedstrings."什么是“填充的Base64编码字符串”以及如何在Ruby中生成它们。下面的代码是我第一次尝试创建转换为Base64的JSON格式数据。xa=Base64.encode64(a.to_json) 最佳答案 他们说的padding其实就是Base64本身的一部分。它是末尾的“=”和“==”。Base64将3个字节的数据包编码为4个编码字符。所以如果你的输入数据有长度n和n%3=1=>"=="末尾用于填充n%
我主要使用Ruby来执行此操作,但到目前为止我的攻击计划如下:使用gemsrdf、rdf-rdfa和rdf-microdata或mida来解析给定任何URI的数据。我认为最好映射到像schema.org这样的统一模式,例如使用这个yaml文件,它试图描述数据词汇表和opengraph到schema.org之间的转换:#SchemaXtoschema.orgconversion#data-vocabularyDV:name:namestreet-address:streetAddressregion:addressRegionlocality:addressLocalityphoto:i
使用带有Rails插件的vim,您可以创建一个迁移文件,然后一次性打开该文件吗?textmate也可以这样吗? 最佳答案 你可以使用rails.vim然后做类似的事情::Rgeneratemigratonadd_foo_to_bar插件将打开迁移生成的文件,这正是您想要的。我不能代表textmate。 关于ruby-使用VimRails,您可以创建一个新的迁移文件并一次性打开它吗?,我们在StackOverflow上找到一个类似的问题: https://sta
我对最新版本的Rails有疑问。我创建了一个新应用程序(railsnewMyProject),但我没有脚本/生成,只有脚本/rails,当我输入ruby./script/railsgeneratepluginmy_plugin"Couldnotfindgeneratorplugin.".你知道如何生成插件模板吗?没有这个命令可以创建插件吗?PS:我正在使用Rails3.2.1和ruby1.8.7[universal-darwin11.0] 最佳答案 随着Rails3.2.0的发布,插件生成器已经被移除。查看变更日志here.现在
exe应该在我打开页面时运行。异步进程需要运行。有什么方法可以在ruby中使用两个参数异步运行exe吗?我已经尝试过ruby命令-system()、exec()但它正在等待过程完成。我需要用参数启动exe,无需等待进程完成是否有任何rubygems会支持我的问题? 最佳答案 您可以使用Process.spawn和Process.wait2:pid=Process.spawn'your.exe','--option'#Later...pid,status=Process.wait2pid您的程序将作为解释器的子进程执行。除
鉴于我有以下迁移:Sequel.migrationdoupdoalter_table:usersdoadd_column:is_admin,:default=>falseend#SequelrunsaDESCRIBEtablestatement,whenthemodelisloaded.#Atthispoint,itdoesnotknowthatusershaveais_adminflag.#Soitfails.@user=User.find(:email=>"admin@fancy-startup.example")@user.is_admin=true@user.save!ende
如何使用RSpec::Core::RakeTask初始化RSpecRake任务?require'rspec/core/rake_task'RSpec::Core::RakeTask.newdo|t|#whatdoIputinhere?endInitialize函数记录在http://rubydoc.info/github/rspec/rspec-core/RSpec/Core/RakeTask#initialize-instance_method没有很好的记录;它只是说:-(RakeTask)initialize(*args,&task_block)AnewinstanceofRake