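A Simple Way to Filter Text Segments in Python
This article walks through an example of a simple way to filter text segments in Python, shared here for reference. The details are as follows:
I. Problem:
Given the following text: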
## Alignment 0: score=397.0 e_value=8.2e-18 N=9 scaffold1&scaffold106 minus
0-0: 10026549 10007782 2e-75
0-1: 10026550 10007781 8e-150
0-2: 10026552 10007780 1e-116
0-3: 10026555 10007778 0
0-4: 10026570 10007768 0
0-5: 10026579 10007758 4e-15
0-6: 10026581 10007738 2e-44
0-7: 10026587 10007734 9e-145
0-8: 10026591 10007732 2e-147
## Alignment 1: score=2304.0 e_value=1e-164 N=47 scaffold1&scaffold107 minus
1-0: 10026836 10007942 2e-84
1-1: 10026839 10007940 0
1-2: 10026840 10007938 0
1-3: 10026842 10007937 9e-82
1-4: 10026843 10007935 7e-79
1-5: 10026847 10007933 3e-119
1-6: 10026850 10007932 2e-87
1-7: 10026854 10007928 5e-22
1-8: 10026855 10007927 3e-101
1-9: 10026856 10007925 1e-106
1-10: 10026857 10007924 0
1-11: 10026858 10007922 9e-123
1-12: 10026859 10007921 1e-80
1-13: 10026860 10007920 8e-104
1-14: 10026862 10007918 4e-25
1-15: 10026863 10007917 0
1-16: 10026864 10007912 4e-40
1-17: 10026865 10007911 0
1-18: 10026866 10007910 7e-122
1-19: 10026867 10007908 2e-25
1-20: 10026868 10007907 0
1-21: 10026869 10007905 0
1-22: 10026870 10007904 3e-150
1-23: 10026871 10007903 5e-77
1-24: 10026874 10007901 0
1-25: 10026875 10007897 0
1-26: 10026876 10007896 0
1-27: 10026877 10007894 0
1-28: 10026880 10007893 3e-52
1-29: 10026881 10007892 0
1-30: 10026882 10007891 0
1-31: 10026883 10007890 0
1-32: 10026886 10007889 1e-50
1-33: 10026887 10007888 6e-157
1-34: 10026888 10007887 0
1-35: 10026889 10007884 0
1-36: 10026890 10007883 2e-18
1-37: 10026891 10007882 9e-64
1-38: 10026892 10007881 0
1-39: 10026895 10007880 0
1-40: 10026898 10007875 0
1-41: 10026900 10007874 0
1-42: 10026901 10007873 0
1-43: 10026902 10007871 2e-123
1-44: 10026903 10007870 0
1-45: 10026905 10007869 0
1-46: 10026909 10007868 1e-81
## Alignment 2: score=811.0 e_value=3.3e-43 N=17 scaffold1&scaffold111 minus
2-0: 10026595 10007449 6e-40
2-1: 10026599 10007448 4e-90
2-2: 10026600 10007447 0
2-3: 10026601 10007444 9e-55
2-4: 10026603 10007438 4e-78
2-5: 10026604 10007434 9e-122
2-6: 10026606 10007432 2e-162
2-7: 10026607 10007427 0
2-8: 10026608 10007426 0
2-9: 10026612 10007417 0
2-10: 10026613 10007415 8e-128
2-11: 10026614 10007414 3e-64
2-12: 10026615 10007409 0
2-13: 10026616 10007406 0
2-14: 10026617 10007403 1e-171
2-15: 10026618 10007402 0
2-16: 10026619 10007397 7e-18
........
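Requirement: if an Alignment header is followed by fewer than 20 lines, remove the entire block (the header together with its lines).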
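To make the rule concrete: in the sample above, Alignment 0 is followed by 9 lines, Alignment 1 by 47, and Alignment 2 by 17, so only Alignment 1 would be kept. The sketch below counts the lines under each header. The function name `count_block_lines` and the tiny inline sample are hypothetical, and it assumes that every header line begins with "## Alignment".

```python
def count_block_lines(lines):
    """Return a list of [header, n_lines] pairs, one per Alignment block.

    Assumption: a header is any line starting with "## Alignment".
    Lines that appear before the first header are ignored here.
    """
    blocks = []
    for line in lines:
        if line.startswith("## Alignment"):
            blocks.append([line.strip(), 0])
        elif blocks:
            blocks[-1][1] += 1
    return blocks


# Made-up miniature input: the first block has 2 lines, the second has 1.
sample = [
    "## Alignment 0: N=2\n",
    "0-0: a b\n",
    "0-1: c d\n",
    "## Alignment 1: N=1\n",
    "1-0: e f\n",
]
for header, n in count_block_lines(sample):
    # The threshold of 20 comes from the requirement above.
    print(header, "->", n, "lines", "(keep)" if n >= 20 else "(drop)")
```

Running a count like this on the real file first is a cheap way to check how many blocks the threshold would discard.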
II. Implementation:
Python code:
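#!/usr/bin/env python3
# Print only the Alignment blocks that contain at least 20 data lines.
count = 0      # data lines seen in the current block
block = []     # header plus data lines of the current block
with open("/root/data.txt", "r") as fd:
    for line in fd:
        # Headers look like "## Alignment 0: ...", so "Alignment" starts at index 3.
        if line.find("Alignment") == 3:
            # A new header closes the previous block: print it if it is long enough.
            if count >= 20:
                print("".join(block), end="")
            count = 0
            block = [line]
        else:
            count += 1
            block.append(line)
# Flush the last block, which has no following header to trigger the check.
if count >= 20:
    print("".join(block), end="")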
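The same buffering idea can also be wrapped in a reusable generator, which makes it easy to test on in-memory data. This is only a sketch: `filter_blocks`, `min_lines`, and `marker` are names invented for illustration, and it assumes headers start with "## Alignment" (slightly stricter than the index check in the script above).

```python
def filter_blocks(lines, min_lines=20, marker="## Alignment"):
    """Yield the lines of every block that has at least min_lines rows.

    A block is a header line starting with `marker` plus the rows after it.
    Lines before the first header form a headerless block, which matches
    how the script above treats them.
    """
    block = []   # header (if any) plus rows of the current block
    rows = 0     # number of non-header rows in the current block
    for line in lines:
        if line.startswith(marker):
            # A new header ends the previous block; emit it if long enough.
            if rows >= min_lines:
                yield from block
            block, rows = [line], 0
        else:
            block.append(line)
            rows += 1
    # Flush the final block, which has no following header to trigger it.
    if rows >= min_lines:
        yield from block


# Made-up demo data with a lowered threshold so the effect is visible.
demo = ["## Alignment 0\n", "0-0\n", "## Alignment 1\n", "1-0\n", "1-1\n"]
print("".join(filter_blocks(demo, min_lines=2)), end="")
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `sys.stdout.writelines(filter_blocks(open_file))`, without loading the whole file into memory.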
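Appendix:
Perl code:
#!/usr/bin/perl
# Print only the Alignment blocks that contain at least 20 data lines.
open(FD, "/root/data.txt");
while (<FD>) {
    if ($_ =~ /Alignment/) {
        # A new header closes the previous block: print it if it is long enough.
        if ($sum >= 20) {
            print @sumdata;
        }
        $sum = 0;
        @sumdata = ($_);
    }
    else {
        $sum++;
        push(@sumdata, $_);
    }
}
# Flush the last block after the end of the file.
print @sumdata if $sum >= 20;
close(FD);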
I hope this article is helpful for your Python programming.