Regex OR to match one or more patterns - python-3.x

I am currently using Python to count the number of PHP array values in a PHP script. The arrays could be multidimensional, associative (key => value pairs), or simple lists.
$arr = ['test','test',$test, $test->test,$arr[0][1][1],['test','test'=> 'another test'], array('test','test')];
$arr2 = array('test' => '','test' => '','test' => '','test' => '','test' => '','test' => '','test' => '','test' => '');
$arr3 = [ 'test' => array('test','test','test','test',
'test' => array('test') )];
Notice that I could have an array declared with square brackets or the array keyword.
Currently, I am using the following Python code:
R1 = re.findall(r"\[.*\]", String)
for L in R1:
    print(len(L.split(',')))
return None
R1 = re.findall(r"array\(.*\)", String)
for L in R1:
    print(len(L.split(',')))
return None
It seems redundant to use two for loops for this. How can I combine the regex expression to count all array values in the three arrays?
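One way to avoid the duplicated loop is regex alternation: `|` lets a single pattern match either form. The following is a minimal sketch rather than a PHP parser; `php_source`, `ARRAY_RE`, and `count_values` are names made up for the illustration, and the sample input is hypothetical. It keeps the question's comma-splitting approach, so nested arrays or commas inside strings will still be miscounted.

```python
import re

# Hypothetical sample input: one PHP array per line.
php_source = """\
$a = ['x', 'y', 'z'];
$b = array('k' => 1, 'j' => 2);
"""

# A single pattern using alternation: either [...] or array(...).
ARRAY_RE = re.compile(r"\[.*\]|array\(.*\)")


def count_values(source):
    """Return a comma-based item count for each array literal found."""
    return [len(match.split(",")) for match in ARRAY_RE.findall(source)]


counts = count_values(php_source)
print(counts)
```

Because `.` does not match newlines by default, an array that spans several lines (like $arr3 above) is not captured as a whole and would need extra handling.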

Related

How to use scala strings in list-like pattern matching

So I was reading up about how Scala lets you treat a String as a sequence of Chars through its implicit mechanism. I created a generic Trie class for a general element type and wanted to use its Char-based implementation with string-like syntax.
import collection.mutable
import scala.annotation.tailrec
case class Trie[Elem, Meta](children: mutable.Map[Elem, Trie[Elem, Meta]], var metadata: Option[Meta] = None) {
  def this() = this(mutable.Map.empty)
  @tailrec
  final def insert(item: Seq[Elem], metadata: Meta): Unit = {
    item match {
      case Nil =>
        this.metadata = Some(metadata)
      case x :: xs =>
        children.getOrElseUpdate(x, new Trie()).insert(xs, metadata)
    }
  }
  def insert(items: (Seq[Elem], Meta)*): Unit = items.foreach { case (item, meta) => insert(item, meta) }
  def find(item: Seq[Elem]): Option[Meta] = {
    item match {
      case Nil => metadata
      case x :: xs => children.get(x).flatMap(_.metadata)
    }
  }
}
object Trie extends App {
type Dictionary = Trie[Char, String]
val dict = new Dictionary()
dict.insert( "hello", "meaning of hello")
dict.insert("hi", "another word for hello")
dict.insert("bye", "opposite of hello")
println(dict)
}
The weird thing is that it compiles fine but throws an error at runtime:
Exception in thread "main" scala.MatchError: hello (of class scala.collection.immutable.WrappedString)
at Trie.insert(Trie.scala:11)
at Trie$.delayedEndpoint$com$inmobi$data$mleap$Trie$1(Trie.scala:34)
at Trie$delayedInit$body.apply(Trie.scala:30)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at Trie$.main(Trie.scala:30)
at Trie.main(Trie.scala)
It's able to implicitly convert String to WrappedString, but that doesn't match the ::. Any workarounds for this?
You can use startsWith as follows:
val s = "ThisIsAString"
s match {
case x if x.startsWith("T") => 1
case _ => 0
}
Or convert your String to a List of chars with toList:
scala> val s = "ThisIsAString"
s: String = ThisIsAString
scala> s.toList
res10: List[Char] = List(T, h, i, s, I, s, A, S, t, r, i, n, g)
And then use it like any other List:
s.toList match {
case h::t => whatever
case _ => anotherThing
}
Your insert method declares item to be a Seq, but your pattern match only matches on List. A string can be implicitly converted to a Seq[Char], but it isn't a List. Use a pattern match on Seq instead of List using +:.
@tailrec
final def insert(item: Seq[Elem], metadata: Meta): Unit = {
  item match {
    case Seq() =>
      this.metadata = Some(metadata)
    case x +: xs =>
      children.getOrElseUpdate(x, new Trie()).insert(xs, metadata)
  }
}
The same applies to your find method.

How to reuse the result from spark stream?

How can we use the value inside the map? It seems the values are not being filled in.
val goalScore = rawScore.transform(rdd=>{
  val minMax = rdd.flatMap(x=>{
    x.behaviorProfileType match {
      case Some("mapper") => Some((x.sourceType, x.targetType, "mapper"), x)
      case Some("non-mappe") => Some((x.sourceType, x.targetType, "non-mapper"), x)
      case _ => None
    }
  })
    .reduceByKey(reduceMinMax(_, _))
    .collectAsMap()
  rdd.map(x => (populateMinMaxWindowGoalScore(x, minMax)))
})
Why is minMax always empty inside the populateMinMaxWindowGoalScore function? rawScore is a DStream.

preg_grep to get index instead of value

I have an array like
$arr = array("arif", "arin", "asif", "armin", "arpan");
I want to search and get the indices of the elements which meet a regex.
In this case I want to get the indices 0, 1, 3, 4, as they match my pattern:
$regex = '|^ar|';
Use preg_grep() for that:
<?php
$arr = array("arif", "arin", "asif", "armin", "arpan");
$regex = '|^ar|';
$res = array_keys(preg_grep($regex, $arr));
var_dump($res);
Use the preg_match function while iterating through the input array:
$arr = array("arif", "arin", "asif", "armin", "arpan");
$keys = [];
foreach ($arr as $k => $item) {
    if (preg_match('/^ar/', $item)) $keys[] = $k;
}
print_r($keys);
The output:
Array
(
    [0] => 0
    [1] => 1
    [2] => 3
    [3] => 4
)
Loop over each item in the array and use preg_match to test whether your regular expression matches it. If it does, add the index to a separate array of indices; if it doesn't, simply continue. You'll be left with an array of indices.
$words = array("arif", "arin", "asif", "armin", "arpan");
$pattern = '|^ar|';
$indices = array();
foreach ($words as $i => $word) {
    // if there is a match
    if (preg_match($pattern, $word)) {
        // append the current index to the indices array
        $indices[] = $i;
    }
}

Scala split string and sort data

Hi, I am new to Scala. My string contains the following data:
CLASS: Win32_PerfFormattedData_PerfProc_Process$$(null)|CreatingProcessID|Description|ElapsedTime|Frequency_Object|Frequency_PerfTime|Frequency_Sys100NS|HandleCount|IDProcess|IODataBytesPersec|IODataOperationsPersec|IOOtherBytesPersec|IOOtherOperationsPersec|IOReadBytesPersec|IOReadOperationsPersec|IOWriteBytesPersec|IOWriteOperationsPersec|Name|PageFaultsPersec|PageFileBytes|PageFileBytesPeak|PercentPrivilegedTime|PercentProcessorTime|PercentUserTime|PoolNonpagedBytes|PoolPagedBytes|PriorityBase|PrivateBytes|ThreadCount|Timestamp_Object|Timestamp_PerfTime|Timestamp_Sys100NS|VirtualBytes|VirtualBytesPeak|WorkingSet|WorkingSetPeak|WorkingSetPrivate$$(null)|0|(null)|8300717|0|0|0|0|0|0|0|0|0|0|0|0|0|Idle|0|0|0|100|100|0|0|0|0|0|8|0|0|0|0|0|24576|24576|24576$$(null)|0|(null)|8300717|0|0|0|578|4|0|0|0|0|0|0|0|0|System|0|114688|274432|17|0|0|0|0|8|114688|124|0|0|0|3469312|8908800|311296|5693440|61440$$(null)|4|(null)|8300717|0|0|0|42|280|0|0|0|0|0|0|0|0|smss|0|782336|884736|110|0|0|1864|10664|11|782336|3|0|0|0|5701632|19357696|1388544|1417216|700416$$(null)|372|(null)|8300715|0|0|0|1438|380|0|0|0|0|0|0|0|0|csrss|0|3624960|3747840|0|0|0|15008|157544|13|3624960|10|0|0|0|54886400|55345152|5586944|5648384|2838528$$(null)|424|(null)|8300714|0|0|0|71|432|0|0|0|0|0|0|0|0|csrss#1|0|8605696|8728576|0|0|0|8720|96384|13|8605696|9|0|0|0|50515968|50909184|7438336|9342976|4972544
Now I want to find the values for keys such as PercentProcessorTime and ElapsedTime. To do this, I first split the string on $$ and then split each part on |. I look for the part that contains PercentProcessorTime and take the index of that key within it. Then I skip the first two parts produced by the $$ split and read the PercentProcessorTime value from each remaining part using that index. It looks complicated, but I think the following code should help:
// First split string as below
val processData = winProcessData.split("\\$\\$")
// get index here
val getIndex: Int = processData.find(part => part.contains("PercentProcessorTime"))
  .map {
    case getData =>
      getData
  } match {
  case Some(s) => s.split("\\|").indexOf("PercentProcessorTime")
  case None => -1
}
val getIndexOfElapsedTime: Int = processData.find(part => part.contains("ElapsedTime"))
  .map {
    case getData =>
      getData
  } match {
  case Some(s) => s.split("\\|").indexOf("ElapsedTime")
  case None => -1
}
// now fetch data of above index as below
for (i <- 2 to (processData.length - 1)) {
  val getValues = processData(i).split("\\|")
  val getPercentProcessTime = getValues(getIndex).toFloat
  val getElapsedTime = getValues(getIndexOfElapsedTime).toFloat
  Logger.info("("+getPercentProcessTime+","+getElapsedTime+"),")
}
The problem is that the code above gives me the values for the given keys in their original order, so my output is (8300717,100),(8300717,17)(8300717,110)... Now I want to sort this data by getPercentProcessTime, so my output should be (8300717,110),(8300717,100)(8300717,17)...
The data should also be in a list, so that I can pass the list to a case class.
Are you looking for PercentProcessorTime or PercentPrivilegedTime?
Here it is:
val str = "your very long string"
val heads = Seq("PercentPrivilegedTime", "ElapsedTime")
val Array(elap, perc) = str.split("\\$\\$").tail.map(_.split("\\|"))
.transpose.filter(x => heads.contains(x.head))
//elap: Array[String] = Array(ElapsedTime, 8300717, 8300717, 8300717, 8300715, 8300714)
//perc: Array[String] = Array(PercentPrivilegedTime, 100, 17, 110, 0, 0)
val res = (elap.tail, perc.tail).zipped.toList.sortBy(-_._2.toInt)
//List[(String, String)] = List((8300717,110), (8300717,100), (8300717,17), (8300715,0), (8300714,0))

using phpExcel to read some cells

I'm trying to read some worksheets using a filter, but the problem is that I can't get the cell values when those cells contain formulas that reference other cells.
$objReader = PHPExcel_IOFactory::createReader('Excel5');
$objReader->setLoadSheetsOnly('Data Sheet #1');
$myFilter = new CellReadFilter(1, 7, range('A', 'F'));
$objReader->setReadDataOnly(true);
$objReader->setReadFilter($myFilter);
$objPHPExcel = $objReader->load('sampleData/example1.xls');
$loadedSheetNames = $objPHPExcel->getSheetNames();
foreach ($loadedSheetNames as $sheetIndex => $loadedSheetName) {
    echo $sheetIndex, ' => ', $loadedSheetName, '<br />';
    $sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);
    var_dump($sheetData);
}
In my Excel file, cell F1 contains the formula =C2, but the output of my script says that F1 is null and not 23.45 as expected.
0 => Data Sheet #1
array (size=3)
1 =>
array (size=6)
'A' => string 'Integer Numbers' (length=15)
'B' => float 123
'C' => float 234
'D' => float -345
'E' => float 456
'F' => null
2 =>
array (size=6)
'A' => string 'Floating Point Numbers' (length=22)
'B' => float 1.23
'C' => float 23.45
'D' => float 3.45E-6
'E' => float -45.678
'F' => float 56.78
3 =>
array (size=6)
'A' => string 'Strings' (length=7)
'B' => string 'Hello' (length=5)
'C' => string 'World' (length=5)
'D' => null
'E' => string 'PHPExcel' (length=8)
'F' => null
My cell filter class looks like the one in the documentation sample:
class CellReadFilter implements PHPExcel_Reader_IReadFilter {
    private $_startRow = 0;
    private $_endRow = 0;
    private $_column = array();
    public function __construct($startRow, $endRow, $column) {
        $this->_startRow = $startRow;
        $this->_endRow = $endRow;
        $this->_column = array_merge($column, array('AA'));
    }
    public function readCell($column, $row, $worksheetName = '') {
        if ($row >= $this->_startRow && $row <= $this->_endRow) {
            if (in_array($column, $this->_column)) { return true; }
        }
        return false;
    }
}
To validate your read filter, modify readCell to show which criteria are being applied, which cells match them, and why each cell is accepted or rejected:
public function readCell($column, $row, $worksheetName = '') {
    echo 'Testing worksheet ', $worksheetName, ' row ', $row, ' column ', $column, PHP_EOL;
    if ($row >= $this->_startRow && $row <= $this->_endRow) {
        echo 'Cell is within row range', PHP_EOL;
        if (in_array($column, $this->_column)) {
            echo 'VALID: Cell is within column range', PHP_EOL;
            return true;
        }
        echo 'INVALID: Cell is outside column range', PHP_EOL;
    } else {
        echo 'INVALID: Cell is outside row range', PHP_EOL;
    }
    return false;
}
