Tuesday, September 19, 2017

Using One Defense to Mitigate Bypass of Another

I am a big fan of using several layers of defense, but it is even better if you can use one defense to mitigate the bypass of another. I have written about this a little before. I ran into one of those scenarios recently and wanted to share it because it ties in nicely with my previous entries for both Windows Firewall with Advanced Security (WFwAS) and AppLocker.

In my internet ramblings I happened across the following blog post by David Fletcher of Black Hills Information Security.

https://www.blackhillsinfosec.com/bypassing-cylance-part-4-metasploit-meterpreter-powershell-empire-agent/

The primary purpose of that post was to show how they were able to get around a particular security tool. The approach was fascinating and surprisingly simple: rename or move powershell.exe, and all of the protections that were intended for that executable no longer work. This had me thinking - what would this do to WFwAS rules that limit PowerShell from downloading content from the internet? I immediately knew the answer. Windows firewall rules restricting PowerShell's connections to the internet match only on the path and executable name. Renaming or moving the executable would go right around them and let the attacker do their business, even as a standard user!
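For context, an outbound block rule of the kind my earlier posts described might be created like this (the rule name is illustrative, and the inverse range assumes a 10.0.0.0/8 internal network - adjust for your environment):

```powershell
# Outbound block rule keyed to the executable's path - the weakness discussed
# above: it no longer applies if powershell.exe is copied elsewhere or renamed.
# The inverse range blocks everything EXCEPT the assumed internal 10.0.0.0/8.
New-NetFirewallRule -DisplayName "Block PowerShell Outbound (non-internal)" `
    -Direction Outbound -Action Block `
    -Program "%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" `
    -RemoteAddress "1.0.0.0-9.255.255.255","11.0.0.0-255.255.255.255"
```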

In the words of US Brigadier General Anthony McAuliffe, "NUTS!"
I started thinking about how I would address this, and after a little while an idea came to me. What if I used an AppLocker deny rule to deny powershell.exe from every location except where it is expected to run (and where my firewall rules apply)? Let's get to work!

First I created an AppLocker executable rule, set to Deny, to prevent PowerShell from running. It could be based on a digital signature (preferred) or a hash, but a hash rule would require maintenance for each version and each time the file was updated.

For the Publisher rule it is important that the File name is specified.
Next, I added an exception based off of the approved path location.

In this case, the exception location is %SYSTEM32%\WindowsPowerShell\v1.0\powershell.exe
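Exported to XML, the resulting rule looks roughly like this (a sketch - the Id is a placeholder GUID, and the publisher values are abbreviated from what the rule wizard actually generates):

```xml
<FilePublisherRule Id="00000000-0000-0000-0000-000000000001"
    Name="Deny powershell.exe outside its expected location"
    Description="" UserOrGroupSid="S-1-1-0" Action="Deny">
  <Conditions>
    <FilePublisherCondition
        PublisherName="O=MICROSOFT CORPORATION, L=REDMOND, S=WASHINGTON, C=US"
        ProductName="MICROSOFT® WINDOWS® OPERATING SYSTEM"
        BinaryName="POWERSHELL.EXE">
      <BinaryVersionRange LowSection="*" HighSection="*" />
    </FilePublisherCondition>
  </Conditions>
  <Exceptions>
    <FilePathCondition Path="%SYSTEM32%\WindowsPowerShell\v1.0\powershell.exe" />
  </Exceptions>
</FilePublisherRule>
```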
That's it! Now to test.

Look at that beautiful red text! 

We can see here that the two technologies working together are effective at stopping this. Windows firewall stops the first attempt to use PowerShell to download external content, and AppLocker stops the workaround!

If possible, rules like this should be configured (by Publisher or Hash) so that all executables with outbound WFwAS rules can only be run from their expected locations, ensuring they are not trivially bypassed by moving or renaming the executables.

If a good list of allowed applications were in place, this type of activity would be stopped in most cases. This rule could be added to an existing policy for extra assurance or to cover any loopholes in the existing policy. If AppLocker isn't configured in an environment, it would also be possible to configure a new AppLocker policy that did nothing but block this particular workaround for any executables defined in WFwAS policies. It would just need a deny rule similar to what is shown above, as well as an Allow rule for everything else. (I wouldn't recommend this policy, by the way, but it would be better than nothing if you were trying to protect WFwAS rules.)

Unfortunately this will only work on Windows Enterprise Edition; I was not able to find a way to do this with Software Restriction Policies. :-(

I think that this is a great illustration of the benefits of defense in depth, even at the endpoint level. If there are ways around a mitigation or a technology it doesn't necessarily mean that technology is no longer useful. It may just mean that we need to get creative and find ways to protect it with the tools available to us.

I have several exciting posts lined up, hopefully I will be able to get them published over the next few weeks.

Until then, work hard and spend time with your family.
Branden
@limpidweb

Monday, July 10, 2017

Scaling AppLocker - Managing it AFTER Deployment!

When working on limiting outbound access to the internet by application with Windows Firewall with Advanced Security I kept running into a key shortcoming - introducing a new or previously unknown application to the system allowed bypasses to firewall rules when using a blocklist configuration. A pre-defined allow configuration of outbound traffic with WFwAS is difficult or nearly impossible to scale, so it wasn't going to solve the problem by itself. WFwAS can do a great job of limiting web access for pre-defined applications in common locations, but it is difficult to use it when there are new applications introduced into folders with different or random path names on thousands of machines. The lack of wildcards or user variables in the path of an application in a rule means that many applications that use the internet could not be managed in a practical manner for more than a small number of machines.

This is one area that an application control policy can help quite a bit. An application control policy can limit newly introduced applications to only those that are trusted (more or less) to behave or have limited abilities to interact with the web programmatically, helping mitigate some of the gaps left by WFwAS. The best bet for a strong security posture on Windows workstations is to use Windows 10 with all of the great mitigations that come with it. For example, Device Guard can be deployed as a very effective deterrent to malware, and many people who are much more knowledgeable than me have discussed it in depth. But what if Device Guard isn't available or isn't enough? There is still a strong case for using AppLocker in many environments, either alone (not as good as Device Guard but better than nothing) or as a more granular supplement to Device Guard.

There is quite a bit of documentation from Microsoft, as well as help on several blogs, with information on how to create an initial AppLocker policy. If you are looking for a place to start, here is a great reference resource; there are many others out there: https://docs.microsoft.com/en-us/windows/device-security/applocker/applocker-overview

A lot of the documentation and blogs discuss creation of an AppLocker policy using a gold image and then pushing it out in a manner that enforces this gold image. It is a good practice and definitely the way to get the most benefit from AppLocker...

...but what about deploying it to a large number of already existing machines that were built over a long period of time with different images of various metallurgic constitutions? What if AppLocker has to be deployed (maybe quickly) into an environment that already has accumulated years of bad habits and many applications that aren't even remotely accounted for by the gold image?

There are a few things that can help make this happen.
  1. Have a clear plan on what to accomplish with AppLocker (see previous link to AppLocker Overview from Microsoft)
  2. Have a method of centralizing and analyzing AppLocker audit logs, such as Windows Event Forwarding (WEF). https://social.technet.microsoft.com/wiki/contents/articles/33895.windows-event-forwarding.aspx
  3. Create a candidate policy based off of a gold image and best practice recommendations.
  4. Test the candidate policy against a representative group in audit mode.
  5. Adjust the policy as necessary using audit logs to inform of blocked files or applications.
  6. Apply the policy in an audit configuration to the whole organization.
  7. Adjust the policy as necessary based off of the audit logs.
  8. Train IT staff and end users on what to expect when the policy is enforced.
  9. When auditing and training is at an acceptable level, enforce the policy.
  10. Continue to adjust policy as needed when something is blocked that should be allowed using enforcement logs.
  11. Continue training.
Sounds pretty straightforward, right? But when I tried this I kept having problems with the steps requiring that the policy be adjusted. When I initially rolled it out I knew of two ways to create an exception to adjust the policy.
  1. Obtain a copy of the application installer and install it on a test machine. Collect all related installer, exe, script, and/or DLL files and use the GPO editor or PowerShell to create new rules.
  2. Copy installer, exe, script, and/or DLL files from a production machine that already has the application installed and use the GPO editor or PowerShell to create new rules.
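The second method can be sketched with the built-in AppLocker cmdlets (the application path and output file name are illustrative):

```powershell
# Method 2 sketched: point at the installed files copied from a production
# machine and build publisher (falling back to hash) rules from them.
Get-AppLockerFileInformation -Directory "C:\Program Files\SomeApp" `
    -Recurse -FileType Exe, Dll, Script |
  New-AppLockerPolicy -RuleType Publisher, Hash -User Everyone -Optimize -Xml |
  Out-File .\SomeAppRules.xml
```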
This worked well when going through the initial deployment, but as the deployment moved from a few machines to a few hundred to thousands these two methods of updating policies didn't scale. The biggest issue was with application installations or updates that executed files or scripts from a temporary location (which would be blocked) and then removed the files from that temporary location when the installation didn't complete. Many of these were triggered by automated update processes that were difficult or time consuming to manually trigger on a test machine. By the time a notification was generated and someone was able to investigate on a production machine the file was already deleted by the installer. Even without those problems, the sheer volume of blocks and time it took to retrieve the files or install them on a test machine to create rules made it difficult to deploy to the entire organization.

In all of the research I did I couldn't find anything that indicated how to solve this problem. All of the guides I had read talked about what to do once I had a local copy of a file, but all I had was a log entry indicating that something had been blocked. I didn't have access to the file, or it took too long to get it, or there were too many.

After a fair amount of researching and frustration I finally found this gem in the Get-AppLockerFileInformation cmdlet documentation:

"The Get-AppLockerFileInformation cmdlet retrieves the AppLocker file information from a list of files or from an event log. File information that is retrieved can include publisher information, file hash information, and file path information. File information from an event log may not contain all of these fields."
 source: https://technet.microsoft.com/en-us/library/ee460961.aspx, emphasis added

The log entry turned out to be the key to solving the problem - and it was staring me in the face the whole time! If I could retrieve the file information from the event log, I could then pipe it into the New-AppLockerPolicy cmdlet and create rules based off of the logs that were already being centralized. This allowed policy modifications for files even if I no longer had access to the file - as long as I had the audit or prevention log event - which meant it could much more easily scale to a full deployment.
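A minimal sketch of that pipeline, assuming the audit events are in the local "EXE and DLL" AppLocker log (on a WEF collector, the forwarded events log could be queried instead; file names here are illustrative):

```powershell
# Pull file information straight from the AppLocker audit events and turn it
# into publisher (falling back to hash) rules - no copy of the file required.
Get-AppLockerFileInformation -EventLog `
    -LogPath "Microsoft-Windows-AppLocker/EXE and DLL" -EventType Audited |
  New-AppLockerPolicy -RuleType Publisher, Hash -User Everyone -Optimize -Xml |
  Out-File .\NewRulesFromAudit.xml

# The generated rules can then be reviewed and merged into an existing policy,
# e.g.: Set-AppLockerPolicy -XmlPolicy .\NewRulesFromAudit.xml -Merge
```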

Here is a link to some example snippets of code: https://github.com/limpidweb/applocker/blob/master/applocker_ps_examples-1

These can be used as part of a much larger script to help automate some of the more tedious tasks in making AppLocker policy changes, such as version control, partial match searching, policy backups, change control, etc.

Hope this helps! Now to start planning a new post...

Until then, work hard and spend time with your family.
Branden
@limpidweb

Monday, June 12, 2017

What to do about SMB...

So, how busy were you patching after the worm outbreak last month? Hopefully not very busy - hopefully you had MS17-010 patched in March, soon after it was released. Unfortunately it appears that many did not, which raises the question, "Why not?" I am sure there are many reasons - I have heard many of them myself in almost every organization I have worked for. So what to do about SMB now that we have seen this worm and others actively taking advantage of unpatched systems? Few mitigations are as good as applying the patch. Please patch quickly and thoroughly! Another good option is to stop using SMB1 - although there are still issues with SMB2 that also need to be patched.

But what can be done if none of these are options? What if you need SMB, but machines can't be patched, or they could be patched but can't be rebooted... because of uptime requirements, or business criticality, or the moon is made out of cheese, or whatever other bad reason for not patching exists... another mitigation option is to block SMB with a firewall. For SMB worms, blocking SMB at the edge of your network and through VPN tunnels is a great start, but what if it gets inside the network through some other form? As long as none of the end users execute untrustworthy email attachments or browse the internet while using a local admin account then things are fine. 😉 For discussion's sake let's just say there is a very remote chance that these (or other) SMB exploits could run on an internal machine on some hypothetical network somewhere. Once the worm is in, the fact that it is remotely exploitable means that it can spread to other systems quickly and have a wide impact if not mitigated internally as well.

One way to stop the risk from unpatched (or unknown vulnerable) systems on the network is to use an endpoint firewall to block the service where it is not needed. For example, almost all workstations need SMB enabled outbound when they are acting as a client, especially to things like domain controllers, file servers, and print servers. But how many things really need inbound access to SMB shares on workstations? There are a few use cases I could think of, but not many. For example, many admin functions that require SMB inbound to a workstation could be limited to hardened jump boxes or servers. Blocking SMB access from a workstation to other workstations can also have a substantial benefit - things like making workstation-to-workstation pivoting much harder. This is a micro implementation of that old adage of network segregation/zones, and there is a lot of benefit from blocking just this one protocol.
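An inbound rule for workstations along those lines might look like this (a sketch - the jump box at 10.0.10.5 and the 10.0.0.0/8 internal network are assumptions, and the inverse ranges block inbound SMB from everything except that one host):

```powershell
# Block inbound SMB on workstations (TCP 445, plus legacy NetBIOS session 139),
# allowing it only from a hardened admin jump box at 10.0.10.5 (illustrative).
New-NetFirewallRule -DisplayName "Block inbound SMB except jump box" `
    -Direction Inbound -Action Block -Protocol TCP -LocalPort 445,139 `
    -RemoteAddress "10.0.0.0-10.0.10.4","10.0.10.6-10.255.255.255"
```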

So if this can help us limit the risk to workstations, how can we limit risk to our servers? Important servers like domain controllers, file servers, and print servers often require SMB and should always be allowed; because of that they should be prioritized to be patched quickly. Outside of these, many servers don't need SMB so it can be turned off, or if a small amount of SMB is needed it can be allowed for just some devices and the rest can be blocked with inbound rules on their endpoint firewall.

If there are servers that can't have the endpoint firewall deployed or the SMB service disabled, another option is to create an outbound rule in Windows Firewall with Advanced Security on as many workstations as possible to block traffic destined for SMB ports on devices that are known to be vulnerable. This isn't guaranteed to stop malicious activity: the firewall could potentially be disabled, or the initial infection could come from a device without an outbound firewall. But if it is pushed out via GPO, many of the highest risk machines could stop most automated infection vectors, which is better than nothing. If there is a similar scenario in the future with SMB or another protocol, the same mechanism could be deployed to protect vulnerable devices as a temporary measure until patching and rebooting could kick in.

Here is what an outbound rule to protect other vulnerable devices could look like (high level):
  1. Have outbound block rules with inverse blacklists for commonly abused applications (like I have covered in previous posts - especially for powershell.exe and mshta.exe)
  2. Configure an outbound block rule for protocol TCP/445 (SMB), and use inverse ranges to exclude (go around) servers/devices that have been patched and restarted (preferably things like Domain Controllers, Print Servers, File Servers, etc. that are patched quickly - I do not recommend blocking these). This would be similar to the inverse ranges configured for applications as discussed in previous posts, the difference for this rule would be to allow for Any application, and then specify a protocol, and this time the "skipped" IPs would likely be on an internal network instead of the internet.
  3. Have the default action be Allow.
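Step 2 above can be sketched like this (the patched servers at 10.0.0.10-10.0.0.20 and the 10.0.0.0/8 internal network are placeholder assumptions - substitute your own):

```powershell
# Block outbound TCP/445 to the internal network EXCEPT the patched
# DCs / file / print servers at 10.0.0.10-10.0.0.20 (illustrative range).
# Applies to Any application; only protocol and port are specified.
New-NetFirewallRule -DisplayName "Block outbound SMB to unpatched hosts" `
    -Direction Outbound -Action Block -Protocol TCP -RemotePort 445 `
    -RemoteAddress "10.0.0.0-10.0.0.9","10.0.0.21-10.255.255.255"

# Step 3: leave the profile's default outbound action as Allow, e.g.:
# Set-NetFirewallProfile -All -DefaultOutboundAction Allow
```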
As devices are patched the inverse list can be modified to "go around" them, allowing traffic to them but blocking other devices that are not patched. In the case of an infection the initial point of entry would be lost, and in many cases accessible file shares could be encrypted (in the case of ransomware), but this could stop the worm from spreading to other vulnerable workstations or servers. This could limit the damage significantly: instead of a shutdown of all devices, recovery could be just a file restore for the affected shares and a re-deploy of the initially infected workstation.

This could be used for other protocols/ports; it certainly would not be limited to SMB. Hopefully there would be enough warning to configure this and push it out to all devices before a new attack hit. A better scenario would be to proactively enable WFP audit logs, forward them to a central location, and analyze them to come up with a proactive list of outbound protocol access to many popular services, allowed only for the destinations that really need them. Here is a possible list of places to start - except for really common ports like 80 and 443. If this could be configured and deployed before something was released, the chances of having it spread on your network would be much lower. There is also the added (marginal) benefit of becoming aware of when someone is trying to use a new service without going through proper channels. Maybe that server admin wasn't authorized to install MySQL, or maybe that server (or workstation!) wasn't supposed to have a file share or SMTP services configured on it. With something like this in place, the workstations couldn't connect to it without the rules being reconfigured... which provides an opportunity to discuss securing and managing the new app before it is deployed to production.

I hesitated to publish this because I still recommend patching as quickly as possible, but I hope this post shows creative ways that Windows Firewall with Advanced Security can be used to quickly block a specific risk without the risks of deploying a full workstation firewall allow list.

As hinted before I have some interesting things in the works for AppLocker. The hard part is done and now I am polishing it up; I hope to have it published in my next post.

Until then, work hard and spend time with your family.
Branden
@limpidweb

Tuesday, May 9, 2017

Nuances in the Audit Logs

In a previous post I discussed the benefit of the Windows Filtering Platform audit logs and how the Windows Firewall logs were not as useful because they did not include the process information with the log entry. Things have been swamped at work, so I am just now getting around to enhancing some of the alerting that is generated from these WFP event logs. I was excited to dig into this information, my imagination going wild with the idea that nearly every workstation on my network could serve as a sensor.
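For reference, these events come from the Filtering Platform Connection audit subcategory, which can be enabled with auditpol (allowed connections appear as event 5156 and blocked connections as 5157 in the Security log):

```powershell
# Enable success/failure auditing for WFP connections (Security events 5156/5157):
auditpol /set /subcategory:"Filtering Platform Connection" /success:enable /failure:enable
```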

Using SIEM rules based off of data collected from a large volume of endpoints is one of my favorite ways to test a theory or a set of rules with low risk of impact. My initial thought was to detect and alert on traffic anomalies based off of the username. IT person #1 might need to be using PowerShell between workstations, but Phone Operator 599 does not. If I could alert on this, I thought maybe I could get to the point of writing rules to further limit what applications were talking back and forth on the network. Easy enough, the data is in the logs. To the logs!

“You keep using that word. I do not think it means what you think it means.”
– Inigo Montoya, The Princess Bride
I quickly found out that even though there is a user field in the log entry, it was blank... on every log file I looked at. Not only was this a huge problem for my proposal for some really useful SIEM alerts, but I also had to go back to my previous post and edit it with a correction.

Strike 1. But wait, there's more...

Since I wasn't getting username information, it was time to move on to my next use case. I thought it would be really helpful to detect anomalous traffic between hosts. This would give visibility into traffic flows that I don't get from traditional sources like network boundary firewalls. I relished the idea of being able to use these logs to trigger on large volumes of addresses or ports scanned from a workstation. I even thought I could be sneaky: collect logs from a compromised host in unexpected ways and forward them off-box before the attacker knew what I was doing, then use that data in creative and innovative ways.

I went about programming SIEM rules to pick up on a handful of scanning scenarios. These rules would be fairly easy to test too... I started off with a quick PowerShell "Test-NetConnection" and saw the results in the SIEM. Success!
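The quick test was just a full TCP connection to a listening port (host and port here are illustrative):

```powershell
# Completes a real TCP handshake, so the connection is audited by WFP
# and the SIEM rule fires:
Test-NetConnection -ComputerName 10.0.0.5 -Port 445
```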

I wanted to prove this out on a larger scale, so I fired up some quick nmap scans that would meet my scenarios and then waited for the alerts to fly. And waited... and waited. They didn't happen. I was getting alerts for other things, but nothing for my scans. After reviewing my alert logic I went straight for the events. There were a few UDP packets, but that was it. I looked at my nmap results and there were thousands of packets being sent... why the disconnect?

I dug into the Windows security log on the test machine and saw the exact same thing as in the SIEM. A few UDP packets, but nothing else. Where were the thousands of TCP connections? I knew for sure that TCP packets were leaving my test machine and data was coming back to populate my nmap scan, but why weren't the WFP logs showing this? Were the logs not logging what I thought they were logging? Inconceivable!

I decided to dig into what exactly constituted a connection and find out. After a bit more testing and searching, I stumbled across this page and the following quotes:
"ALE is a set of Windows Filtering Platform (WFP) kernel-mode layers that are used for stateful filtering.
"Stateful filtering keeps track of the state of network connections and allows only packets that match a known connection state.
...
"Filters in the ALE layers authorize inbound and outbound connection creation, port assignments, socket operations such as listen(), raw socket creation, and promiscuous mode receiving.
"Traffic at the ALE layers is classified either per-connection or per-socket operation. At non-ALE layers, filters can only classify traffic on a per-packet basis.
"ALE layers are the only WFP layers where network traffic can be filtered based on the application identity—using a normalized file name—and based on the user identity—using a security descriptor.
...
"For this reason, policies that enforce who (for example, "administrator") and/or which application (for example, "Internet Explorer") are allowed to perform the network operations mentioned above are authored at the ALE layers."
After reading this things started to make a little more sense. When I executed nmap it was running as an administrator and it was configured to perform a TCP SYN scan. Since it was being run as an administrator, nmap could create raw sockets and was only sending a SYN packet and moving on without completing the handshake. Since the handshake was never completed, a stateful "connection" was never made. I believe WFP is auditing based on the ALE layer information. If the TCP handshake isn't completed, a connection isn't made; if a connection isn't made, a WFP audit log isn't created; if a WFP audit log isn't created, my super cool SIEM alerts never fire.

To further prove this, I re-ran the nmap scans with the "-sT" option, forcing it to use the OS stack and complete the handshake. My SIEM blew up with the alerts that I had configured. Things worked as expected.
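Side by side, the two scans that produced such different results (the target range is illustrative):

```shell
nmap -sS -p 445 10.0.0.0/24   # SYN scan via raw sockets: handshakes never
                              # complete, so no WFP connection events are logged
nmap -sT -p 445 10.0.0.0/24   # connect() scan: the OS stack completes each
                              # handshake, and every connection is audited
```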

I haven't yet found proof of this other than the events I have described. I have two theories for this:

  1. Since a TCP connection isn't fully established, the ALE layer doesn't classify a SYN scan as a connection and doesn't log it, but UDP and ICMP show every packet (or at a minimum the first packet in every source IP/port and destination IP/port combo) because they are not stateful.
  2. The raw sockets somehow bypass the filtering drivers.
I am currently leaning towards the first theory.

I had a brief glimmer of hope that maybe the firewall logging that I disabled would provide different data and be more helpful - maybe it captured the Transport or Network layer data. But, it doesn't appear to be so. It was very similar to the auditing logs.

Somewhat deflated, I have had to temper my excitement for my network of sensors, at least for detecting outbound connection scenarios. When I tested Sysmon with the "-n" option, it appeared to have the same problem with outbound detections of SYN scans. I haven't yet verified all of the scenarios around inbound SYN scans with WFP audit logs.

Well, at least there isn't an easy way to use TCP SYN to exfil data to hide it from my logs, like programs up to no good like this, or this, or standards that would allow anything to do it like this. :-( Looks like I have some testing to do to see if my network devices are picking up data in SYN packets. From a workstation perspective, it may be that the best option for this kind of data is in the massive data source known as ETW, but probably for other reasons.

I know that using logs from a machine that I am assuming is compromised is a weak and error-prone option, but I was hoping the element of surprise would work in my favor. It looks like I am headed back to the drawing board, with this data set limited to applications that play nice with the OS - which is a lot, but now comes with an important caveat. One more reason to limit administrative accounts for end users and patch to prevent privilege escalation, just in case you needed one.

I hope to move on from Windows Firewall with Advanced Security on my next series of blog posts and focus on Windows Event Forwarding or a recent adventure... AppLocker!

Until then, work hard and spend time with your family.
Branden
@limpidweb