Find specific sentence in a web page using powershell - string

I need to use powershell to resolve IP addresses via whois. My company filters port 43 and WHOIS queries so the workaround I have to use here is to ask powershell to use a website such as https://who.is, read the http stream and look for the Organisation Name matching the IP address.
So far I have managed to get the webpage read into powershell (example here with a WHOIS on yahoo.com) which is https://who.is/whois-ip/ip-address/206.190.36.45
So here is my snippet:
$url=Invoke-WebRequest https://who.is/whois-ip/ip-address/206.190.36.45
now if I do :
$url.gettype()
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True False HtmlWebResponseObject Microsoft.PowerShell.Commands.WebResponseObject
I see this object has several properties:
Name MemberType Definition
---- ---------- ----------
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
AllElements Property Microsoft.PowerShell.Commands.WebCmdletElementCollection AllElements {get;}
BaseResponse Property System.Net.WebResponse BaseResponse {get;set;}
Content Property string Content {get;}
Forms Property Microsoft.PowerShell.Commands.FormObjectCollection Forms {get;}
Headers Property System.Collections.Generic.Dictionary[string,string] Headers {get;}
Images Property Microsoft.PowerShell.Commands.WebCmdletElementCollection Images {get;}
InputFields Property Microsoft.PowerShell.Commands.WebCmdletElementCollection InputFields {get;}
Links Property Microsoft.PowerShell.Commands.WebCmdletElementCollection Links {get;}
ParsedHtml Property mshtml.IHTMLDocument2 ParsedHtml {get;}
RawContent Property string RawContent {get;}
RawContentLength Property long RawContentLength {get;}
RawContentStream Property System.IO.MemoryStream RawContentStream {get;}
Scripts Property Microsoft.PowerShell.Commands.WebCmdletElementCollection Scripts {get;}
StatusCode Property int StatusCode {get;}
StatusDescription Property string StatusDescription {get;}
but every time I try commands like
$url.ToString() | select-string "OrgName"
Powershell returns the whole HTML code because it interprets the text string as a whole. I found a workaround dumping the output into a file and then read the file through an object (so every line is an element of an array) but I have hundreds of IPs to check so that's not very optimal to create a file all the time.
I would like to know how I could read the content of the web page https://who.is/whois-ip/ip-address/206.190.36.45 and get the line that says :
OrgName: Yahoo! Broadcast Services, Inc.
and just that line only.
Thanks very much for your help! :)

There are most likely better ways to parse this, but you were on the right track with your current logic.
$web = Invoke-WebRequest https://who.is/whois-ip/ip-address/206.190.36.45
$web.tostring() -split "[`r`n]" | select-string "OrgName"
Select-String was returning the whole text as the match because, previously, it was one long string. Using -split we can break it up into lines and get just the line you expected.
OrgName: Yahoo! Broadcast Services, Inc.
Some string manipulation after that will get a cleaner answer. Again, there are many ways to approach this as well.
(($web.tostring() -split "[`r`n]" | select-string "OrgName" | Select -First 1) -split ":")[1].Trim()
I used Select -First 1 as select-string could return more than one object. It would just ensure we are working with 1 when we manipulate the string. The string is just split on a colon and trimmed to remove the spaces that are left behind.
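The same split-then-filter idea can be tried without touching the network. A minimal sketch, assuming a hypothetical helper name (Get-OrgNameFromText) and a made-up sample string standing in for the page content:

```powershell
# Hypothetical helper (not part of the answer above): return the first OrgName value found in raw text.
function Get-OrgNameFromText {
    param([string]$Text)

    # Split on CR/LF so Select-String sees one line at a time instead of one long string.
    $match = $Text -split "[`r`n]" | Select-String 'OrgName' | Select-Object -First 1
    if (-not $match) { return $null }

    # Split at the first colon only and trim the spaces that are left behind.
    ($match.Line -split ':', 2)[1].Trim()
}

# Sample text standing in for $web.ToString(); the real page contains far more.
$sample = "NetName: NETBLK1-YAHOOBS`r`nOrgName: Yahoo! Broadcast Services, Inc.`r`nOrgId: YAHO"
Get-OrgNameFromText -Text $sample
```

Because the helper works on a string, it could be called in a loop over many IP addresses with the content of each response, with no temporary files involved.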
Since you are pulling HTML data we could also walk through those properties to get more specific results. The intention here was to get a result similar to 1RedOne's answer.
$web = Invoke-WebRequest https://who.is/whois-ip/ip-address/206.190.36.45
$data = $web.AllElements | Where{$_.TagName -eq "Pre"} | Select-Object -Expand InnerText
$whois = ($data -split "`r`n`r`n" | select -index 1) -replace ":\s","=" | ConvertFrom-StringData
$whois.OrgName
All that data is stored in the text of the PRE tag in this example. What I do is split up the data into its sections (Sections are defined with blank lines separating them. I look for consecutive newlines). The second group of data contains the org name. Store that in a variable and pull the OrgName as a property: $whois.OrgName. Here is what $whois looks like
Name Value
---- -----
Updated 2013-04-02
City Sunnyvale
Address 701 First Ave
OrgName Yahoo! Broadcast Services, Inc.
StateProv CA
Country US
Ref http://whois.arin.net/rest/org/YAHO
PostalCode 94089
RegDate 1999-11-17
OrgId YAHO
You can also make that hashtable into a custom object if you prefer dealing with those.
[pscustomobject]$whois
Updated : 2017-01-28
City : Sunnyvale
Address : 701 First Ave
OrgName : Yahoo! Broadcast Services, Inc.
StateProv : CA
Country : US
Ref : https://whois.arin.net/rest/org/YAHO
PostalCode : 94089
RegDate : 1999-11-17
OrgId : YAHO
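The section-splitting step can be sketched on sample text. This is only an illustration: the sample lines are made up from the output shown above, and the split pattern is an assumption that accepts either CRLF or LF line endings rather than the exact "`r`n`r`n" used in the answer.

```powershell
# Sample text standing in for the PRE element's InnerText: two sections separated by a blank line.
$data = @(
    'NetName: NETBLK1-YAHOOBS'
    'CIDR: 206.190.32.0/19'
    ''
    'OrgName: Yahoo! Broadcast Services, Inc.'
    'OrgId: YAHO'
    'Ref: http://whois.arin.net/rest/org/YAHO'
) -join "`r`n"

# Break the text into sections on blank lines (non-capturing group, so no separators are returned).
$sections = $data -split '(?:\r?\n){2,}'

# Turn "key: value" into "key=value" in the second section, then build a hashtable from it.
# Only a colon followed by whitespace is replaced, so the colon inside the URL is left alone.
$whois = $sections[1] -replace ':\s', '=' | ConvertFrom-StringData

$whois.OrgName
$whois.Ref
```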

It is very simple to use the whois app from Microsoft. Put the app in System32 or windir, run the whois command in PowerShell, and then use Select-String to pick out the organization line, like this:
PS C:\> whois.exe -v 206.190.36.45 | Select-String "Registrant Organization"
Registrant Organization: Yahoo! Inc.
I recommend this app because it gives you more information for your work.

Here you go, the way to do this is in fact to do an Invoke-WebRequest. If we take a look at some of the properties of the object we get from Invoke-WebRequest, we can see that PowerShell has already parsed some of the HTML and text for us.
All that we have to do is pick out some of the values we'd like to work with. For instance, taking a peek at the text exposed through the ParsedHtml property (ParsedHtml.body.outerText), we see these results.
These fields begin on about line 30 or so. In my approach to solving this problem we know that we'll find good data like this mid-way down the page, so if we could scrape the values from these lines, we'd be on our way to working with the data. The code to accomplish this first part is this:
$url = "https://who.is/whois-ip/ip-address/$ipaddress"
$Results = Invoke-WebRequest $url
$ParsedResults = $Results.ParsedHtml.body.outerText.Split("`n")[30..50]
Now, PowerShell has a number of very powerful commands to import and convert data into various formats. For instance, if we could only replace the ':' colon character with an equals sign '=', we could send the whole mess over to ConvertFrom-StringData and have rich PowerShell objects to work with. It turns out that we can easily do that using the universal -Replace operator, like this
$Results.ParsedHtml.body.outerText.Split("`n")[30..50] -replace ":","="
I figured you might want to do this again in the future, so I took the entire thing and made it into a simple five line function for you. Throw this into your $Profile and enjoy.
So the finished result looks like this:
Function Get-WhoIsData {
    param($ipaddress='206.190.36.45')
    $url = "https://who.is/whois-ip/ip-address/$ipaddress"
    $Results = Invoke-WebRequest $url
    $ParsedResults = $Results.ParsedHtml.body.outerText.Split("`n")[30..50] -replace ":","=" | ConvertFrom-StringData
    $ParsedResults
}
and using it works this way:
PS C:\windows\system32> Get-WhoIsData -ipaddress 206.190.36.45
Name Value
---- -----
NetRange 206.190.32.0 - 206.190.63.255
CIDR 206.190.32.0/19
NetName NETBLK1-YAHOOBS
NetHandle NET-206-190-32-0-1
Parent NET206 (NET-206-0-0-0-0)
NetType Direct Allocation
OriginAS
Organization Yahoo! Broadcast Services, Inc. (YAHO)
RegDate 1995-12-15
Updated 2012-03-02
Ref http=//whois.arin.net/rest/net/NET-206-190-32-0-1
OrgName Yahoo! Broadcast Services, Inc.
OrgId YAHO
Address 701 First Ave
City Sunnyvale
StateProv CA
PostalCode 94089
You can then select any of the properties you'd like using normal Select-Object or Where-Object commands. For example, to pull out just the orgName property, you'd use this command:
(Get-WhoIsData).OrgName
>Yahoo! Broadcast Services, Inc.
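Replacing every colon also rewrites the colon inside URLs, which is why the Ref value above shows up as http=//. One possible variant, sketched here with a hypothetical helper name (ConvertFrom-WhoisText) and made-up sample lines, splits each line at the first colon only:

```powershell
# Hypothetical helper: parse "Key: Value" lines into an ordered hashtable,
# splitting each line at the first colon only so values such as URLs stay intact.
function ConvertFrom-WhoisText {
    param([string[]]$Line)

    $result = [ordered]@{}
    foreach ($entry in $Line) {
        # Lines without a colon (including blank lines) do not match and are skipped.
        if ($entry -match '^\s*([^:]+?)\s*:\s*(.*?)\s*$') {
            # Keep the first occurrence if a key repeats.
            if (-not $result.Contains($Matches[1])) {
                $result[$Matches[1]] = $Matches[2]
            }
        }
    }
    $result
}

# Sample lines standing in for the [30..50] slice of the page text.
$lines = @(
    'NetRange:       206.190.32.0 - 206.190.63.255'
    'Ref:            http://whois.arin.net/rest/net/NET-206-190-32-0-1'
    'OrgName:        Yahoo! Broadcast Services, Inc.'
)
$whois = ConvertFrom-WhoisText -Line $lines
$whois.Ref
$whois.OrgName
```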

Related

Need to apply an if condition based on a check in Powershell

I am new to PowerShell. I am getting the details of the Azure Data Factory linked services, but after retrieving them I need to use contains to check whether an element exists. In Python I would just check whether a string is in a list, but in PowerShell I am not quite sure how. Please check the code below.
$output = Get-AzDataFactoryV2LinkedService -ResourceGroupName $ResourceGroupName -DataFactoryName "xxxxxxxx" | Format-List
The output of the above is (sample output given below):
LinkedServiceName : abcdef
ResourceGroupName : ghijk
DataFactoryName : lmnopq
Properties : Microsoft.Azure.Management.DataFactory.Models.AzureDatabricksLinkedService
So now I try to do this:
if ($output.Properties -contains "Microsoft.Azure.Management.DataFactory.Models.AzureDatabricksLinkedService") {
Write-Output "test output"
}
But $output.Properties gives us the properties of that json.
I need to check if "Microsoft.Azure.Management.DataFactory.Models.AzureDatabricksLinkedService" exists in output variable and perform the required operations. Please help me on this.
The -contains operator requires a collection and an element. Here's a basic example of its proper use:
$collection = @(1,2,3,4)
$element1 = 5
$element2 = 3
if ($collection -contains $element1) {'yes'} else {'no'}
if ($collection -contains $element2) {'yes'} else {'no'}
What you've done is ask PowerShell to look in an object that isn't a collection for an element of type [string] and value equal to the name of that same object.
What you need to do is inspect this object:
$output.Properties | format-list *
Then once you figure out what needs to be present inside of it, create a new condition.
$output.Properties.something -eq 'some string value'
...assuming that your value is a string, for example.
I would recommend watching some beginner tutorials.
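One way to turn the question's check into a genuine collection-versus-element comparison is to compare type names. The sketch below uses stand-in objects ([version] and [uri] values) in place of the Azure linked-service objects, so the type names are assumptions for illustration only. In the real case, $output would also need to be assigned without the trailing Format-List, because the Format-* cmdlets emit formatting objects rather than the original ones.

```powershell
# Stand-in objects: [version] and [uri] values play the role of the typed Properties objects.
$services = @(
    [pscustomobject]@{ LinkedServiceName = 'abcdef'; Properties = [version]'1.0' }
    [pscustomobject]@{ LinkedServiceName = 'uvwxyz'; Properties = [uri]'http://example.com' }
)

# Collect the full type name of each Properties object; this is a real collection of strings.
$typeNames = $services | ForEach-Object { $_.Properties.GetType().FullName }

# -contains now compares a collection against a single element, as intended.
if ($typeNames -contains 'System.Uri') {
    Write-Output 'test output'
}

# Alternatively, filter the services whose Properties object is of a given type.
$matching = $services | Where-Object { $_.Properties -is [uri] }
$matching.LinkedServiceName
```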

Import-Excel is showing a random string instead of the sheet

I'm currently working on a project to automate some stuff. However, I'm quite blocked by a problem reading an Excel file. I have an Excel file (related to MS Forms, so basically it contains the results of a form and it is shared on MS SharePoint), and I tried importing it using PowerShell like this:
$value=Import-Excel "C:\Users\sth\sth1\MyFile.xlsx"
The problem is that $value gives me this as a result instead of the table:
TXzkw9Vz0Own80zEzjQhphDvtf05gBAi4P some other stuff
--------------------------------------------------------------------------------
Form1
{d23c5acf-c4d2-47d5-9784-another random string}
FYI I did another test with another .xlsx file not linked to MS Forms and it worked.
EDIT:
$value | Get-Member gives this:
TypeName: System.Management.Automation.PSCustomObject
Name MemberType Definition
---- ---------- ----------
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
cTXzkwEzjQhphDvtf05gBAi4PdtR-BB_JUOU5WSVFVTlFQNFZONERJQ1hGWDg1N0xOOS4u NoteProperty string cTXzkw9Vz0Own80zEzjQhphDvtf05gBAi4PdtR-BB_JUOU5WONERJQ1hGWDg1N0xOOS4u=Form1
EDIT 2:
The table in the excel should be like this
ID, Email, Question1, Question2
----------------------------------
1 Email1 answer 1 Answer3
2 Email2 answer2 Answer 4
...
Does anyone have a clue about this?
Thank you very much !!

What is the ParameterizedProperty Chars property on the string type added there by the Powershell?

Please, observe:
C:\> ''|Get-Member |? { $_.MemberType -eq 'ParameterizedProperty' }
TypeName: System.String
Name MemberType Definition
---- ---------- ----------
Chars ParameterizedProperty char Chars(int index) {get;}
C:\>
This is a very weird property. First, it is added by PowerShell; second, it contains an infinitely recursive property:
C:\> ''.Chars
IsSettable : False
IsGettable : True
OverloadDefinitions : {char Chars(int index) {get;}}
TypeNameOfValue : System.Char
MemberType : ParameterizedProperty
Value : char Chars(int index) {get;}
Name : Chars
IsInstance : True
C:\> ''.Chars.Value
IsSettable : False
IsGettable : True
OverloadDefinitions : {char Chars(int index) {get;}}
TypeNameOfValue : System.Char
MemberType : ParameterizedProperty
Value : char Chars(int index) {get;}
Name : Chars
IsInstance : True
C:\> ''.Chars.GetHashCode()
56544304
C:\> ''.Chars.Value.GetHashCode()
34626228
C:\> ''.Chars.Value.Value.GetHashCode()
3756075
C:\> ''.Chars.Value.Value.Value.GetHashCode()
49108342
C:\> ''.Chars.Value.Value.Value.Value.GetHashCode()
62340979
C:\> ''.Chars.Value.Value.Value.Value.Value.GetHashCode()
24678148
C:\>
The hash code is different every time, so it must be dynamically generated.
Why do I care? I am trying to use a Newtonsoft.Json PowerShell module from PSGallery and it chokes on this property, but only when run in Desktop PowerShell (5.1), not the Core (7.0.3). The problem is that I do not have a minimal reproduction, the input object is quite large. The error I get is:
ConvertTo-JsonNewtonsoft : Exception calling "SerializeObject" with "2" argument(s): "Self referencing loop detected for property 'Value' with type 'System.Management.Automation.PSParameterizedProperty'. Path 'environments[4].conditions.name.Chars'."
No such problem exists in PS Core.
Can someone explain to me what is this property, why we need it and how can we get rid of it?
EDIT 1
I guess it is a problem with the Newtonsoft.Json module. Observe:
[DBG]> [pscustomobject]@{ a = 1} | ConvertTo-Json
{
"a": 1
}
[DBG]> [pscustomobject]@{ a = 1} | ConvertTo-JsonNewtonsoft
{
"CliXml": "<Objs Version=\"1.1.0.1\" xmlns=\"http://schemas.microsoft.com/powershell/2004/04\">\r\n <Obj RefId=\"0\">\r\n <TN RefId=\"0\">\r\n
<T>System.Management.Automation.PSCustomObject</T>\r\n <T>System.Object</T>\r\n </TN>\r\n <ToString>@{a=1}</ToString>\r\n <Obj RefId=\"1\">\r\n <TNRef RefId=\"0\" />\r\n <MS>\r\n <I32 N=\"a\">1</I32>\r\n </MS>\r\n </Obj>\r\n <MS>\r\n <I32 N=\"a\">1</I32>\r\n </MS>\r\n </Obj>\r\n</Objs>"
}
[DBG]>
It is unable to properly interpret powershell objects. Makes it unusable.
tl;dr
Your real problem is that neither the Newtonsoft.Json library nor the PowerShell wrapper module for it support [pscustomobject] instances:
The library asks [pscustomobject] instances to serialize themselves, based on the [pscustomobject] ([psobject]) implementing the ISerializable interface.
In Windows PowerShell this happens to fail outright, presumably due to the bundled version of the Newtonsoft.Json.dll assembly being quite old (as of this writing, the bundled version is 8.0, whereas 12.0 is current) and having a bug.
The manifestation of this bug is the Self referencing loop detected for property 'Value' ... error you saw.
In PowerShell [Core] v6+, the newer version of Newtonsoft.Json.dll that ships with PowerShell itself preempts the obsolete version, so the error doesn't occur, but the serialization problem becomes obvious:
The resulting { "CliXml": "<Objs Version=\"1.1.0.1\" .. } JSON text shows that the [pscustomobject] instance was serialized in CLIXML format, PowerShell's native XML-based serialization format, as notably used by PowerShell's remoting feature.
It is hypothetically possible - though very cumbersome - to manually deserialize such JSON by post-processing it and replacing objects with only a CliXml property with the return value from [System.Management.Automation.PSSerializer]::Deserialize()
Solutions:
If your intent is simply to compare serialized representations in a PS-edition-agnostic form, irrespective of the specific serialization format, consider using CLIXML directly, via Export-CliXml and Import-CliXml.
If you do want a PS-edition-agnostic way to serialize to JSON, you'll have to roll your own [pscustomobject]-to-ordered-hashtable converter, because serializing ordered hashtables ([ordered] @{ ... }, System.Collections.Specialized.OrderedDictionary) via Newtonsoft.Json does round-trip properly in PowerShell (it is, in fact, the data structure used by the ConvertFrom/To-JsonNewtonsoft wrapper cmdlets).
Both approaches are demonstrated in this related answer.
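A [pscustomobject]-to-ordered-hashtable converter of the kind described above could look roughly like this. It is a minimal sketch, not the code from the related answer: the function name ConvertTo-OrderedHashtable is hypothetical, and it only handles custom objects, arrays and scalar values.

```powershell
# Hypothetical converter: recursively turn [pscustomobject] graphs into ordered hashtables.
function ConvertTo-OrderedHashtable {
    param($InputObject)

    # Custom objects become ordered hashtables, one entry per property, in property order.
    if ($InputObject -is [System.Management.Automation.PSCustomObject]) {
        $result = [ordered]@{}
        foreach ($property in $InputObject.psobject.Properties) {
            $result[$property.Name] = ConvertTo-OrderedHashtable $property.Value
        }
        return $result
    }

    # Collections (but not strings or dictionaries) are converted element by element.
    if ($InputObject -is [System.Collections.IEnumerable] -and
        $InputObject -isnot [string] -and
        $InputObject -isnot [System.Collections.IDictionary]) {
        # The unary comma keeps the array from being unrolled on output.
        return ,@(foreach ($item in $InputObject) { ConvertTo-OrderedHashtable $item })
    }

    # Everything else (numbers, strings, $null, existing dictionaries) is passed through unchanged.
    return $InputObject
}

$source = [pscustomobject]@{
    name   = 'demo'
    nested = [pscustomobject]@{ level = 2 }
    items  = @(1, 2)
}
$converted = ConvertTo-OrderedHashtable $source
$converted.GetType().FullName
$converted['nested']['level']
```

The converted value would then be what gets passed to the JSON serializer instead of the original custom object.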

Powershell: Loading all items/properties into a new object

Take this code:
$logged_on_user = get-wmiobject win32_computersystem | select username
If I want to output the value into a new string I'd do something like:
$A = $logged_on_user.username
However, if I do the following:
$logged_on_user = get-wmiobject win32_computersystem | select *
...to try to assign all the values to a new "object", which of these do I use?
$logged_on_user.items
$logged_on_user.value
$logged_on_user.text
$logged_on_user.propertry
I've tried them all and they don't work.
Anybody got any ideas?
Thanks
P.S. I think I may have got the title of this question wrong.
In your example:
$logged_on_user = get-wmiobject win32_computersystem | select username
creates a new PSCustomObject with a single property - username. When you do the following:
$A = $logged_on_user.username
you are assigning the return value of the PSCustomObject's username property to a variable $A. Because the return type of the username property is a string, $A will also be a string.
When executing the following:
$cs = get-wmiobject win32_computersystem
If you assign $cs to a new variable like in the following:
$newVariable = $cs
Then $newVariable will reference the same object $cs does, so all properties and methods that are accessible on $cs will also be accessible on $newVariable.
If you don't specify any properties or call any methods on an object when assigning a return value to another variable, then the return value is the object itself, not the return value of one of the object's properties or methods.
Additional info, but not directly related to the question:
When you pipe the output of get-wmiobject to select-object, like in the following:
$cs = get-wmiobject win32_computersystem | select-object *
The variable $cs is then of type PSCustomObject rather than ManagementObject (which is what you get when you do not pipe to Select-Object). The new object has all of the same properties and values as the ManagementObject that was piped in.
So, if you only want the property values contained by the ManagementObject, there is no need to pipe the output to Select-Object, as this just creates a new object (of type PSCustomObject) with the values from the ManagementObject. Select-Object is useful when you either want to select a subset of the properties of the object that is being piped in, or if you want to create a new PSCustomObject with different properties that are calculated through expressions.
I'm not sure if you're asking about copying the results of Get-WmiObject or PowerShell objects in general. In the former case, Get-WmiObject returns instances of the ManagementObject class, which implements the ICloneable interface that provides a Clone method. You can use it like this...
$computerSystem = Get-WmiObject -Class 'Win32_ComputerSystem';
$computerSystemCopy = $computerSystem.Clone();
After the above code executes, $computerSystem and $computerSystemCopy will be identical but completely separate ManagementObject instances. You can confirm this by running...
$areSameValue = $computerSystem -eq $computerSystemCopy;
$areSameInstance = [Object]::ReferenceEquals($computerSystem, $computerSystemCopy);
...and noting that $areSameValue is $true and $areSameInstance is $false.
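The difference between assigning a reference and making a copy, and a way to walk over every property without knowing its name, can be sketched with a plain custom object. The object below is a made-up stand-in for the result of piping Get-WmiObject to Select-Object *, so the property values are assumptions.

```powershell
# A small custom object standing in for the result of "Get-WmiObject ... | Select-Object *".
$original = [pscustomobject]@{ UserName = 'CONTOSO\alice'; Domain = 'CONTOSO' }

# Plain assignment: both variables point at the same object.
$alias = $original

# psobject.Copy() makes a shallow copy with its own set of properties.
$copy = $original.psobject.Copy()

$alias.Domain = 'CHANGED'
$original.Domain   # also changed, because $alias and $original are the same object
$copy.Domain       # unaffected by the change

# Enumerate every property name and value without knowing the names in advance.
foreach ($property in $original.psobject.Properties) {
    '{0} = {1}' -f $property.Name, $property.Value
}
```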

Programmatically create View with multiple content types

I have a custom list that has 1 content type. That content type has a parent content type that it inherits from.
Type 1 has 3 fields:
Field A,
Field B,
Field C
Type 2 has 2 fields and inherits Type 1:
Field D,
Field E
I am programmatically creating a few views. When I do this through the SP UI it works great, no complaints, but when I do it in a PowerShell script, like so:
$web = Get-SPWeb [Site URL]
$list = $web.Lists[ListName]
$list.Views.Add($viewName, $includeFieldsCollection, $query, 100, $true, $false)
$web.Dispose()
Where $includeFieldsCollection is all fields A-E.
I get the error:
Exception calling "Add" with "6" argument(s): "Column 'Field A' does not exist. It may have been deleted by another user."
How can I do this in PowerShell? It does not want to see the columns that its getting from the higher scope. If I look at the SP UI they show up just fine.
Thank you.
Edit: I have to correct the issue.
Are you sure that the column names you are passing to the powershell function are the internal names?
http://msdn.microsoft.com/en-us/library/ms480493.aspx
Normally spaces are replaced with _x0020_
Field A would be Field_x0020_A
I would make sure the list I'm working with is actually using the correct content type and fields. You can check them using:
$list.fields | format-table
and
$list.contenttypes | format-table
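To see what the escaped form of a display name looks like, the .NET XmlConvert class can be used as an approximation. This is only a sketch under the assumption that the column's internal name follows the same _xHHHH_ escaping; SharePoint can assign a different internal name (for example after a column is renamed), so the InternalName values listed by $list.Fields remain the reliable source.

```powershell
# Assumption: the internal name uses the same _xHHHH_ escaping that XmlConvert.EncodeName produces.
$displayName = 'Field A'
$encoded = [System.Xml.XmlConvert]::EncodeName($displayName)
$encoded   # Field_x0020_A

# Decoding goes the other way and recovers the display name.
[System.Xml.XmlConvert]::DecodeName($encoded)   # Field A
```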
