Sampling one class of the training data with pandas in Python: an example
No preamble here; let's go straight to the code.
# -*- coding: utf-8 -*-
import pandas as pd

# Columns in the source file:
# creativeID, userID, positionID, clickTime, conversionTime, connectionType,
# telecomsOperator, appPlatform, sitesetID, positionType, age, gender,
# education, marriageStatus, haveBaby, hometown, residence, appID,
# appCategory, label


def test():
    df = pd.read_csv('/var/lib/mysql-files/data1.csv', sep=',')
    df1 = df[['connectionType', 'telecomsOperator', 'appPlatform', 'sitesetID',
              'positionType', 'age', 'gender', 'education', 'marriageStatus',
              'haveBaby', 'hometown', 'residence', 'appCategory', 'label']]
    # Show how many rows each class has.
    print(df1['label'].value_counts())

    N_data = df1[df1['label'] == 0]  # negative class
    P_data = df1[df1['label'] == 1]  # positive class
    # Down-sample the negatives to the number of positives, without replacement.
    N_data = N_data.sample(n=P_data.shape[0], replace=False, random_state=2, axis=0)
    print(P_data.shape)
    print(N_data.shape)

    data = pd.concat([N_data, P_data])
    print(data.shape)
    # Shuffle the combined rows and rebuild the index.
    data = data.sample(frac=1).reset_index(drop=True)
    print(data[['label']])
    return data
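Because the code above depends on a local CSV file, here is a minimal self-contained sketch of the same down-sampling idea on synthetic data. The helper name `undersample_majority`, the `toy` frame, and its `age` column are made up for illustration, and the sketch assumes the majority class has at least as many rows as the minority class.

```python
import numpy as np
import pandas as pd


def undersample_majority(df, label_col="label", majority=0, minority=1, seed=2):
    """Hypothetical helper: balance two classes by down-sampling the majority."""
    neg = df[df[label_col] == majority]
    pos = df[df[label_col] == minority]
    # Draw as many majority rows as there are minority rows, without replacement.
    neg = neg.sample(n=len(pos), replace=False, random_state=seed)
    balanced = pd.concat([neg, pos])
    # frac=1 returns every row in random order; reset_index drops the old index.
    return balanced.sample(frac=1, random_state=seed).reset_index(drop=True)


# Synthetic, imbalanced data: 950 negatives and 50 positives.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "age": rng.integers(18, 60, size=1000),
    "label": [0] * 950 + [1] * 50,
})

balanced = undersample_majority(toy)
print(balanced["label"].value_counts())
```

Fixing `random_state` makes the draw repeatable, which helps when comparing models trained on the balanced set.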
Further notes: sampling a DataFrame with pandas
Random sampling
import pandas as pd

# Randomly draw 2000 rows from the DataFrame df
df.sample(n=2000)
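As a small sketch of the main options of `DataFrame.sample`, the example below draws by count, by fraction, and with replacement. The frame `df` and its column `x` are invented for illustration.

```python
import pandas as pd

# Illustrative frame with 10000 rows.
df = pd.DataFrame({"x": range(10000)})

# Exactly 2000 distinct rows; random_state makes the draw repeatable.
by_count = df.sample(n=2000, random_state=0)

# 20% of the rows, specified as a fraction instead of a count.
by_frac = df.sample(frac=0.2, random_state=0)

# Sampling with replacement allows more draws than there are rows.
with_repl = df.sample(n=20000, replace=True, random_state=0)

print(len(by_count), len(by_frac), len(with_repl))
```

`n` and `frac` are mutually exclusive, so pass only one of them in a given call.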
Stratified sampling
Functions in scikit-learn give a flexible way to do this
from sklearn.model_selection import train_test_split

# y is one of the attribute columns of X; stratify=y keeps its class proportions in both splits
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)
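Stratified sampling can also be sketched in pandas alone. This is an alternative to the scikit-learn call, not a replacement for it, and it assumes pandas 1.1 or later, where `DataFrameGroupBy.sample` is available. The frame and its `feature` column are invented for illustration.

```python
import pandas as pd

# Illustrative frame: 900 rows of class 0 and 100 rows of class 1.
df = pd.DataFrame({
    "feature": range(1000),
    "label": [0] * 900 + [1] * 100,
})

# Draw 20% of the rows inside each label group, so the 9:1 ratio is preserved.
held_out = df.groupby("label").sample(frac=0.2, random_state=0)

# Everything that was not drawn forms the remaining set.
remaining = df.drop(held_out.index)

print(held_out["label"].value_counts())
print(remaining["label"].value_counts())
```

Sampling per group keeps the rare class represented in both parts, which a plain `df.sample` does not guarantee.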
