Powershell Script to Open a Web Page and Bypass SSL Certificate Errors

nbeam published 11 years ago in Microsoft, Powershell, SSL. Tags: Powershell browse website, Powershell bypass SSL certificate warning, Powershell conditional statements, Powershell SSL certificate error, Powershell web scraping

I have been using powershell to automate Internet Explorer interactions with a web application with a login page in our internal environment at work. I ran into an issue with my script because the page I was trying to access was secured with SSL and we were using a self-signed certificate. This causes Internet Explorer to redirect to a warning page rather than going straight to the site. I needed a way to bypass this warning page in code and I finally came up with a solution I am sharing here.

Normally I dive into articles like this with an assumption that people know everything I am talking about. This will be a bit different because on this topic I feel pretty green myself. As this is a fresh subject for me, I am going to treat it as a fresh subject for you. That being said, we are actually going to be working with a somewhat complex script by the time we are done and if you pick up most of the concepts along the way you are going to be off at a sprint with Powershell scripting.

Some general info about powershell syntax for the Uninitiated

Declaring variables looks something like this:

PS C:\> $variablename = 'value'

Running that variable from the command-line would then look like this:

PS C:\> $variablename
value

If you want to print a message to the CLI shell, you just drop the message in quotes on a new line.

"some message about what my script is doing"

Conditional Statements in use (I will talk about these in depth later):
While, If, Else

STEP 1: GET IE GOING

First, we are going to talk about running IE (Internet Explorer) inside of powershell. To start a new IE session “com object” you run the following code:

$ie = New-Object -com InternetExplorer.Application

You may have noted that we are actually not just starting a new session but we are also defining that session as the contents of a variable I called $ie. This is important to note because going forward, if we want to interact with our browser session, we interact with the $ie variable which is now something called a com object. You can go more in-depth on that subject here: https://technet.microsoft.com/en-us/library/dd347574.aspx

The next line of code lets you set whether or not you want the browser to be visible. Setting this to “$true” means an IE window will actually pop up when you run your script which is handy for development/diagnostic stuff with your script. When you finalize your script you probably want to set this value to $false.

$ie.visible=$true

Notice that I can use a “.” after my variable to expand into other functionality (methods and properties and more…) For a better explanation of this functionality, I would recommend checking this article out: http://windowsitpro.com/powershell/powershell-basics-variables
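To make the “.” idea concrete, here is a small sketch that uses an ordinary string instead of the IE object. The variable name $greeting is made up for illustration and is not part of our script. Properties are read without parentheses, methods are called with them, and Get-Member lists what an object has to offer.

```powershell
# $greeting is a hypothetical variable used only for this illustration.
$greeting = 'hello'

# A property is read with a dot and no parentheses.
$length = $greeting.Length

# A method is called with a dot and parentheses.
$shout = $greeting.ToUpper()

"Length: $length"        # Length: 5
"Upper case: $shout"     # Upper case: HELLO

# Get-Member lists the properties and methods available on an object.
$memberNames = $greeting | Get-Member | Select-Object -ExpandProperty Name
```

The same pattern applies to $ie: visible is a property, and navigate (which we use later) is a method.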

To be quite honest, I just kind of learned this without understanding the underlying theory. My personality is such that I am a “search Google, copy and paste stuff to try and see what works” kind of guy. But reading up on the theory can help immensely before or after the fact, so please check out the links I am seeding throughout this article as you have time.

Next, I am going to declare another variable for the URL we want to interact with throughout our script. I am going to use single quotes instead of double quotes and you can read up more on that here: https://technet.microsoft.com/en-us/library/hh847740.aspx

I might be wrong in saying this, but based on my (brief) experience, anything in single quotes tells powershell to interpret the contents EXACTLY as typed. So if I declare a very simple variable like $int= 5 and then I want to use that in some statement printed to the command line, the output depends on whether I use single or double quotes. Here is an example of this theory, hopefully it makes sense:

PS C:\> $int = 5
PS C:\> $int
5
PS C:\> "$int is equal to 5"
5 is equal to 5
PS C:\> '$int is equal to 5'
$int is equal to 5

Notice, when using double quotes, special characters are actually special and $int gets interpreted, however when using single quotes, powershell interprets the contents EXACTLY as typed.

So for a URL, I probably want the contents of the variable to be interpreted explicitly as I don’t want special characters interpreted. So I will declare my URL variable like this:

$url= 'https://www.pcwebshop.co.uk/'
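The difference matters most when a URL happens to contain a “$”. Here is a small sketch using a made-up address (example.test and the $id variable are assumptions for illustration, not part of our script):

```powershell
# Hypothetical value and URL, for illustration only.
$id = 42

# Double quotes: PowerShell expands $id inside the string.
$expanded = "https://example.test/page?item=$id"

# Single quotes: the text is kept exactly as typed.
$literal = 'https://example.test/page?item=$id'

$expanded   # https://example.test/page?item=42
$literal    # https://example.test/page?item=$id
```

Our real URL has no “$” in it, so either style would work there. Single quotes just remove the chance of a surprise.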

You will notice that I am using “https://www.pcwebshop.co.uk/” as the page for this script to interact with. There is a specific reason for this. The goal of this article is to show you how to bypass an Internet Explorer SSL Certificate Error in code. This site is a safe site using HTTPS (SSL) that will display a cert error because they are using a self-signed certificate (for exactly this reason… demonstration purposes). If you visit this site you will see Internet Explorer’s certificate warning page instead of the site itself.

And so will your script which is emulating an IE session. So we need to bypass it if we want to get at the website behind it. That is what we will be getting at by the end of this article. Moving on…

So, in conclusion for step one… we should have this for code:

$ie = New-Object -com InternetExplorer.Application
$ie.visible=$true
$url= 'https://www.pcwebshop.co.uk/'

STEP 2: NAVIGATE TO YOUR URL – REDIRECT TO THE CERT WARNING

$ie.navigate("$url")

This command actually makes our Internet explorer session navigate to the page we specified earlier as the contents of the $url variable.

Next we are going to introduce a bit of logic into our script using a conditional statement.

while($ie.ReadyState -ne 4) {start-sleep -m 100};

The above command uses what is called a “while” statement and the logic works more or less like this…
while (whatever thing I put in here is true) {do this action here}

So the code in-between the “()” specifies a state of being of some kind and the code after that in the “{}” specifies some kind of activity or action. This usage of () and {} is fairly consistent throughout powershell so keep it in mind.

What we are doing here is telling the script to stop moving forward until Internet Explorer reports that the page is actually loaded. You can read a bit more about “$ie.readystate” here: https://msdn.microsoft.com/library/ms534361(v=vs.85).aspx

The short of it though is that IE reports a loading web page as being in one of five states, numbered 0 through 4, and state 4 means the page is complete (fully loaded). So our code says WHILE the readystate of our IE browser is NE (NOT EQUAL) to “4”, please sleep (stop running) for 100 milliseconds. The “-m” here is short for “-Milliseconds”, not minutes. A “WHILE” loop re-checks its condition after every pass, so the script naps for a tenth of a second, checks the ready state again, and repeats. As soon as the ready state equals “4” the loop ends and our script moves on.
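Here is a sketch of the same loop that you can run without a browser. $fakeState stands in for $ie.ReadyState, and the $maxChecks limit is my own addition (an assumption, not something the original script has) so the loop cannot spin forever if the page never finishes loading.

```powershell
# $fakeState is a stand-in for $ie.ReadyState so the loop can run without a browser.
$fakeState = 0
$checks = 0
$maxChecks = 50   # hypothetical limit: 50 checks x 100 ms is about 5 seconds

while ($fakeState -ne 4 -and $checks -lt $maxChecks) {
    Start-Sleep -Milliseconds 100   # same as "start-sleep -m 100"
    $checks++
    $fakeState++                    # pretend the page moves one state closer to complete
}

if ($fakeState -eq 4) {
    "Ready after $checks checks"
} else {
    "Gave up after $checks checks"
}
```

With the pretend page advancing one state per pass, this prints “Ready after 4 checks”. In a real script you would test $ie.ReadyState in the condition and drop the $fakeState lines.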

Now, that is all good and well, but I am going to give you a disclaimer. In my experience readystate isn’t a reliable indicator that the page is fully loaded. I will talk about better ways to check this in a different article as it can screw up your script if you start doing stuff on a page that isn’t loaded yet. There are better ways to check but this will suffice for our purposes here. Moving on…

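By the end of step 2, your full code should look like this:

$ie = New-Object -com InternetExplorer.Application
$ie.visible=$true
$url= 'https://www.pcwebshop.co.uk/'
$ie.navigate("$url")
while($ie.ReadyState -ne 4) {start-sleep -m 100};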

STEP 3: BYPASS THE CERT WARNING

Okay, this is when it gets really fun…

if ($ie.document.url -Match "invalidcert")
{
"Bypassing SSL Certificate Error Page";
$sslbypass=$ie.Document.getElementsByTagName("a") | where-object {$_.id -eq "overridelink"};
$sslbypass.click();
"sleep for 10 seconds while final page loads";
start-sleep -s 10;
};

After our page loads we are going to go into another section of decision logic using an “IF” statement. The same syntax rules that we used for “while” still apply here.

IF (this thing/condition is true) {then please do this action or series of actions}.

So first, what is in our () as a condition?
$ie.document.url returns the URL of the page IE has reached. So if we had gotten all the way to, say, pcwebshop.co.uk, $ie.document.url would equal “https://www.pcwebshop.co.uk/”.

The -Match operator basically looks for whatever string I put in “” after it in the contents of the item that comes before it (strictly speaking it treats that string as a regular expression, but a plain word like this one matches literally). So I am looking for the string “invalidcert” inside of the URL of the page IE has found. “invalidcert” will always appear somewhere in the URL if IE is presenting the certificate error page. So if we are displaying a certificate error page, this “if” statement condition will be true and the code inside of the {} will execute. If it isn’t true (in other words, we aren’t getting a cert error) then this code block will just be skipped which is what we want.
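You can try -Match on plain strings to see how the condition behaves. The error-page address below is only a sample of what IE shows for a certificate warning (treat the exact form as an assumption, yours may differ). The part that matters is that it contains “invalidcert”.

```powershell
# Sample addresses for illustration; the exact error-page URL on your machine may differ.
$errorPageUrl = 'res://ieframe.dll/invalidcert.htm?SSLError=50331648#https://www.pcwebshop.co.uk/'
$normalUrl    = 'https://www.pcwebshop.co.uk/'

$errorPageUrl -match 'invalidcert'   # True
$normalUrl -match 'invalidcert'      # False

# -match ignores upper/lower case by default.
$errorPageUrl -match 'INVALIDCERT'   # True
```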

Now, let’s examine what action our script is taking. The first line of code is just printing a message to the command line to tell us what is going on. The next command after that is loaded, so let’s break it down.

$sslbypass=$ie.Document.getElementsByTagName("a") | where-object {$_.id -eq "overridelink"};

Right off the bat you should notice that we are declaring a new variable I have called “SSLbypass”. The contents of that variable are links. $ie.document.getElementsbyTagName tells powershell to examine the HTML on the page and grab any tags that match what is in the (). So in this case we are grabbing “A” tags which are links.

But we don’t want ALL of the links on the page. We really just want the button that needs to be clicked in Internet Explorer that will let us continue on to our web-page. Right now we have collected ALL of the “A” tags and to pare that down we are going to pipe this to another command which will sort through the tags and only return the one we want as the contents of the $sslbypass variable. For that we use “where-object”. If you examine the HTML visually for the cert error page that Internet Explorer pops up you will see this:

<A href='' ID="overridelink" NAME="overridelink" >Continue to this website (not recommended).</A>

Note what ID is equal to and compare it to our powershell code and you will quickly see that we are only grabbing tags that have this ID and for this page that means we are only grabbing the link to continue on to our final destination and bypass this warning.

So in the end, $sslbypass is equal to the link we need to click to bypass our error.
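The filtering step can be tried on its own with a couple of stand-in objects. In this sketch the $links list and the “moreInfoLink” id are made up for illustration. In the real script the elements come from getElementsByTagName. The check for $null is my own addition: if nothing matches, $sslbypass is empty, and calling click() on it fails with a “null-valued expression” error.

```powershell
# Stand-ins for the <a> elements on the page (hypothetical data).
$links = @(
    [pscustomobject]@{ id = 'moreInfoLink'; innerText = 'More information' }
    [pscustomobject]@{ id = 'overridelink'; innerText = 'Continue to this website (not recommended).' }
)

# Keep only the element whose id is "overridelink".
$sslbypass = $links | Where-Object { $_.id -eq 'overridelink' }

# If nothing matched, $sslbypass is $null, so check before using it.
if ($null -ne $sslbypass) {
    "Found link: $($sslbypass.innerText)"
} else {
    "No link with id 'overridelink' was found"
}
```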

Finally, we need to actually “click” on this link and that can be done like this:

$sslbypass.click();

And then finally this:

"sleep for 10 seconds while final page loads";
start-sleep -s 10;
};

So we add two more commands, one that prints a message to the command-line and another that tells our script to wait for 10 more seconds (a crude way to make sure the page has time to load). Finally we add a “}” to close out the action block of our “IF” statement and we are all done with this portion.

Shew! That was a bit of explaining. Your entire script should now look like this:

"Starting Internet Explorer in Background"
$ie = New-Object -com InternetExplorer.Application
$ie.visible=$true
$url = 'https://www.pcwebshop.co.uk/'
$ie.navigate("$url")
while($ie.ReadyState -ne 4) {start-sleep -m 100};
if ($ie.document.url -Match "invalidcert")
{
"Bypassing SSL Certificate Error Page";
$sslbypass=$ie.Document.getElementsByTagName("a") | where-object {$_.id -eq "overridelink"};
$sslbypass.click();
"sleep for 10 seconds while final page loads";
start-sleep -s 10;
};

Side note about semi-colons:
One other note, you may have noticed that each powershell command in my action block (everything between the “{}” brackets) ends with a semi-colon, “;”. This isn’t required when each command sits on its own line, as it does in a script (.ps1) file, but I find it does help me keep things clean so I include them. If you squeeze an “IF” statement onto a single line, for example when typing it at the powershell command-line, then the semi-colons ARE REQUIRED to separate the commands inside the “{}”.
END SIDE NOTE
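
As a quick illustration of the semi-colon note, here is the same “IF” block written both ways. The variable names and messages are made up for the example. Capturing the output in a variable is only there so the two versions can be compared.

```powershell
# Several statements on one line need semicolons between them.
$messages = if ($true) { 'first message'; 'second message' }

# On separate lines, the line breaks do the same job and the semicolons are optional.
$sameMessages = if ($true) {
    'first message'
    'second message'
}

"One-line version produced $($messages.Count) messages"
"Multi-line version produced $($sameMessages.Count) messages"
```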

STEP 4: ADD THE FINISHING TOUCHES AND CLEAN UP

The final bit of our code is going to just give us a nice successful or not successful message and then kill internet explorer so it doesn’t continue running in the background. We will use the conditional “IF” statement with an additional “ELSE” statement. The construction of an if statement looks like this:
IF (this thing/condition is true) {then please do this action or series of actions} ELSE that stuff isn’t true {then do this action or series of actions}
In our code it will look like this:

if ($ie.Document.domain -Match "pcwebshop")
{
"Successfully Bypassed SSL Error";
}
else
{
"Bypass failed";
}
get-process iexplore | stop-process

We are using the $ie.document.domain -Match again to check if our internet explorer browser session has made it to the website. If it has then we print a message saying it was all successful, if it doesn’t match then we print a message saying it failed. Simple enough. The last line: get-process iexplore | stop-process tells the windows machine to rather ungracefully kill all IE processes. There are some gentler ways of going about this but I found this to be the most effective, albeit crude.
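For one of the gentler ways, the IE object has a Quit() method that asks that one browser window to close instead of killing every IE process. Below is a minimal sketch. Close-Browser is a hypothetical helper name of my own, and a stand-in object is used so the example runs without IE. In the real script you would call it as Close-Browser $ie.

```powershell
# Close-Browser is a hypothetical helper, not part of the original script.
function Close-Browser {
    param($Browser)

    # Quit() asks the browser object to close its own window.
    if ($null -ne $Browser) {
        $Browser.Quit()
    }
}

# A stand-in object with a Quit() method, so the call can be shown without IE.
$fakeBrowser = [pscustomobject]@{ Closed = $false }
$fakeBrowser | Add-Member -MemberType ScriptMethod -Name Quit -Value { $this.Closed = $true }

Close-Browser $fakeBrowser
"Closed: $($fakeBrowser.Closed)"   # Closed: True
```

I have not swapped this into the script above. The get-process line remains the blunt but dependable option.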

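So, by the end of step 4, your final and complete script will look like this:

"Starting Internet Explorer in Background"
$ie = New-Object -com InternetExplorer.Application
$ie.visible=$true
$url = 'https://www.pcwebshop.co.uk/'
$ie.navigate("$url")
while($ie.ReadyState -ne 4) {start-sleep -m 100};
if ($ie.document.url -Match "invalidcert")
{
"Bypassing SSL Certificate Error Page";
$sslbypass=$ie.Document.getElementsByTagName("a") | where-object {$_.id -eq "overridelink"};
$sslbypass.click();
"sleep for 10 seconds while final page loads";
start-sleep -s 10;
};
if ($ie.Document.domain -Match "pcwebshop")
{
"Successfully Bypassed SSL Error";
}
else
{
"Bypass failed";
}
get-process iexplore | stop-process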

CONCLUSION

That finishes this rather long article up. This script doesn’t do much on its own but in learning how it all works you should now have a pretty firm grasp on some basic powershell coding and concepts. It also is a working solution to a rather tricky problem regarding certificate errors when you are trying to automate browsing of a website for, say, scraping data. In a future article I will take this script and expand on it to show you how to log in to a web page and then scrape some data to a text file. Until then, remember that “drinking and coding” don’t always mix well together but sometimes can provide some rather creative results! Cheers!

References:

Coding Logic:
http://www.computerperformance.co.uk/powershell/powershell_conditional_operators.htm
http://www.computerperformance.co.uk/powershell/powershell_if_statement.htm#Example_2_If_with_Else_
https://technet.microsoft.com/en-us/library/hh847759.aspx
https://4sysops.com/archives/if-else-switch-conditional-statements-in-powershell/
http://blogs.technet.com/b/heyscriptingguy/archive/2009/05/06/how-can-i-use-the-if-statement-in-windows-powershell.aspx
http://www.powershellpro.com/powershell-tutorial-introduction/logic-using-loops/
http://windowsitpro.com/powershell/powershell-basics-variables
https://technet.microsoft.com/en-us/library/hh847740.aspx
http://blogs.technet.com/b/heyscriptingguy/archive/2014/07/14/powershell-string-theory.aspx
http://ss64.com/ps/syntax-esc.html

IE Page Navigation:
https://technet.microsoft.com/en-us/library/dd347574.aspx
http://www.powershellmagazine.com/2013/01/31/pstip-retrieve-a-redirected-url-powershell-3-0-way/

IE Page Interaction:
http://stackoverflow.com/questions/22768353/powershell-website-automating-button-click-on-login
http://www.howtogeek.com/124736/stupid-geek-tricks-extract-links-off-any-webpage-using-powershell/
https://social.technet.microsoft.com/Forums/windowsserver/en-US/be3afe83-4a7e-48a0-b2e7-95fd081a7571/login-to-website-using-powershell?forum=winserverpowershell
http://stackoverflow.com/questions/28031495/invoke-webrequest-asp-login-form
https://social.technet.microsoft.com/Forums/en-US/675be952-aea0-42c1-80b5-543a95dc48ec/login-to-site-with-powershell-and-ie-checking-a-form-generated-with-javascript?forum=winserverpowershell
http://www.example-code.com/powershell/http_formAuthentication.asp
https://khr0x40sh.wordpress.com/2014/08/22/powershell-ie-backdoor/
https://msdn.microsoft.com/en-us/magazine/cc337896.aspx

Code Snippets, Tips, Tricks:
https://stackoverflow.com/questions/22510779/can-powershell-wait-until-ie-is-dom-ready/29900361#29900361
http://stackoverflow.com/questions/11610922/powershell-close-internet-explorer-gracefully-cleanly
http://www.mrexcel.com/forum/excel-questions/355550-do-while-ie-busy-ie-readystate-readystate_complete-not-working.html
https://msdn.microsoft.com/library/ms534361(v=vs.85).aspx


13 comments on: Powershell Script to Open a Web Page and Bypass SSL Certificate Errors

Dustin
10 years ago

This is Awesome! Exactly what I needed… I’m getting the error below related to the “click” operation. Any help would be appreciated.

You cannot call a method on a null-valued expression.
At line:12 char:25
+ $sslbypass.click <<<< ();
+ CategoryInfo : InvalidOperation: (click:String) [], RuntimeException
+ FullyQualifiedErrorId : InvokeMethodOnNull

johnrees2014
10 years ago

Good article. You also make a comment above about ReadyState not being reliable. Did you ever write about a better way to check whether the document is ready? You said: In my experience readystate isn’t a reliable indicator that the page is fully loaded. I will talk about better ways to check this in a different article as it can screw up your script if you start doing stuff on a page that isn’t loaded yet

nbeam
10 years ago

Heh… Unfortunately no. This little project came to a halt because apparently internet explorer cannot be run without a logged-in user on the system. Which pretty much killed what I needed it for. I needed the ability for a system to auto-login to a credentialed form on an internal SSL protected page for whenever other automated processes on the system ran.

Regarding ready-state though… I think what I did end up doing was writing some logic into the script that checked for specific things on the page I was wanting to pull up. So something like:

IF: Check if this text exists: “text”

Then: Continue on with rest of script.

Else: Wait 5 seconds, Check if text exists: “text”

That is a hash because I am responding months later in a comment so I apologize :). I am a little rusty. But the basic idea is if you know something that should be constant on the page then you can write logic into your script to look for it. If it exists then continue processing and if not, wait for 5 seconds and then check again.

I hope this is of some benefit. It ends up making the script longer and more complex but I do remember the readystate being pretty useless.

If you really just want to “band-aid” the issue you can just add in a generous wait time after page load initiation. But then the script takes sometimes much longer to run than need be.

johnrees2014
10 years ago

I appreciate you replying to such an old post. Yes, I have seen this approach used on other blogs, and I have always wondered whether if I have to make ‘if “text” exists’ checks, does that mean I may as well eliminate any ReadyState checks, or do they still serve some purpose? Perhaps the ReadyState check at least proves that the browser has navigated to a new page before the script starts checking for particular text? BTW I have also been automating some authentication related IE controlling from PowerShell that may be of interest. It overcomes the problem that IE loses the document property when browsing to a Protected Mode page. Happy to share if you want to know more. john dot rees at nz dot fujitsu dot com.

nbeam
10 years ago

while($ie.document.body.outerHTML -notMatch "<input type=`"submit`" value=`"Continue`">") {start-sleep -m 100};

That is the statement I used in my original script. There was a submit button on the page once loaded. So instead of using the “readystate” I instead opted to have the script constantly check the page for that button. As long as that button wasn’t present, the script would sleep. Hopefully that is helpful 🙂

Pedro
10 years ago

I tried but I received the following error;

Cannot find an overload for “getElementsByTagName” and the argument count: “1”.
At line:7 char:1
+ $sslbypass=$ie.Document.getElementsByTagName(“a”) | where-object {$_.id -eq “ove …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [], MethodException
+ FullyQualifiedErrorId : MethodCountCouldNotFindBest

Can someone help me ?

johnrees2014
10 years ago

A strange error since the Document property should be ready by the time you call getElementsByTagName.

What output do you get when you run:

$ie = New-Object -com InternetExplorer.Application
$ie.visible=$true
$uri = 'https://www.pcwebshop.co.uk/'
$ie.navigate("$uri")
while($ie.ReadyState -ne 4) {start-sleep -m 100};
$ie.Document.url
$ie.Document | Get-Member "get*"

banesh
10 years ago

how can we implement this code using EXCEL VBA.

nbeam
10 years ago

I don’t know visual basic so I am sorry 🙁 – My understanding is the syntax is similar but you may have to go as far as using different methods/modules as I am not sure how much of that overlaps with powershell.

literateaspects
9 years ago

Hello, this resolves the Windows Smart Screen impacted by https://support.microsoft.com/en-us/help/17443/windows-internet-explorer-smartscreen-filter-faq

Amol More
9 years ago

Hay this what i been looking for, from many days but i am trying to run this same program in Vbscript and I am stuck at

$sslbypass=$ie.Document.getElementsByTagName("a") | where-object {$_.id -eq "overridelink"};
$sslbypass.click();

Can you please help me get this sorted ?

Ken Riggleman
5 years ago

In Windows 10 I had to use the following for above to work:
$sslbypass=$ie.Document.body.getElementsByTagName("a") | where-object {$_.id -eq "overridelink"};

nbeam
5 years ago

Thanks for this update! I wrote this several years ago so this is a helpful update!
