In a previous discussion, we explained some aspects of vulnerability scanning. In this session, we briefly cover some basics of vulnerability scanning in the context of security operations.
Vulnerability scanning is a common method used in information security to assess risks. It is similar to a doctor using an X-ray to examine a patient's body for any issues. Security professionals often use vulnerability scanning to assess whether target systems have vulnerabilities and to decide on the next steps for security protection.
The principle behind vulnerability scanning involves sending specific requests to remote services and determining the existence of specific vulnerabilities based on the behavior or version information returned by these services.
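As a minimal sketch of the version-based half of this principle, the snippet below judges a service from the banner it returns. The product name "ExampleFTPd" and the fixed version 2.3.5 are hypothetical values chosen for illustration, not a real advisory.

```python
import re

# Hypothetical advisory: "ExampleFTPd" versions below 2.3.5 are assumed vulnerable.
FIXED_IN = (2, 3, 5)
BANNER_RE = re.compile(r"ExampleFTPd[ /](\d+(?:\.\d+)*)")


def parse_version(banner):
    """Extract a version tuple such as (2, 3, 4) from a service banner, or None."""
    match = BANNER_RE.search(banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_potentially_vulnerable(banner):
    """Return True or False when a version is found, None when the banner cannot be judged."""
    version = parse_version(banner)
    if version is None:
        return None
    # Tuple comparison orders versions numerically, so 2.10.0 is newer than 2.3.5.
    return version < FIXED_IN
```

A version match only indicates a possible vulnerability. Vendors sometimes backport fixes without changing the reported version, so a result like this normally needs confirmation.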
Impact of vulnerability scanning
1.1 Network impact: The frequency and quantity of network packet requests can have an impact on the network and applications. High request rates may overload switches or routers, leading to a chain reaction and potential service interruptions.
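One common way to keep the request rate under control is a token bucket. The sketch below is an illustration rather than a feature of any particular scanner. The class name, its parameters, and the injectable clock are our own assumptions.

```python
import time


class TokenBucket:
    """Allow at most `rate` probes per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.last = clock()

    def try_acquire(self):
        """Return True if a probe may be sent now, False if the caller should wait."""
        now = self.clock()
        # Refill in proportion to the elapsed time, never above the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A scan loop would call try_acquire() before each probe and sleep briefly whenever it returns False, so the rate seen by switches and routers stays near the agreed limit.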
1.2 Impact on exception handling: In some cases, a business service may not handle unexpected input correctly, leading to exceptions or crashes. For example, a service using a proprietary protocol that happens to listen on TCP port 80 might crash when it receives an HTTP GET request.
1.3 Impact on logs: When scanning public-facing services, each URL probe can result in a 4xx or 5xx error log entry. Business monitoring usually relies on the status codes in the access logs. Unless the monitoring is adjusted beforehand, a sudden spike in 4xx errors will trigger a response from the business's SRE (Site Reliability Engineering) and RD (Research and Development) teams. If they find that a security engineer triggered it, and that it also caused the impacts described in 1.1 and 1.2, the responsibility lies with the security engineer.
Issues that arise
For security engineers, not conducting vulnerability scanning may mean being unable to perform their work, identify company risks, or carry out governance tasks. For business stakeholders, vulnerability scanning introduces the risk of disruptions and unavailability of services, which can be a significant concern. Some peers in the industry have faced blame and negative consequences due to these issues, while others have strained relationships with the business side.
Where the problem lies
From the business side's point of view, vulnerability scanning is a change. Issues are inevitable when a change is introduced, and it would be unusual to have none. The best practice is to follow the "change management" principles from ITIL (Information Technology Infrastructure Library).
Change plan: Define the scanning time, IP/URL/port range, queries per second (QPS), and test case selection (including DoS test cases, asset selection for Delete/Update operations, and selection of POST hidden interfaces).
Change risk assessment: Evaluate the impact on network traffic, router capacity, business QPS, and the extreme risk of business/network failure.
Change notification: Ensure that the business managers, RD, SRE, DBA, QA, and even network maintenance teams are aware of the key information mentioned above and have authorized the scanning (at a minimum, a company-wide notification is mandatory).
Rollback plan: Prepare a quick response plan to stop scanning and restore business operations if problems occur (some actions may require the cooperation of key stakeholders informed during the change notification).
Change observation: Pay attention to service errors and assess business continuity during the scanning process, enabling prompt responses to any issues.
Change summary: Identify areas for improvement based on the execution of scanning and make adjustments in subsequent work.
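The steps above can be captured in a small data structure that is checked before any scan starts. This is only a sketch: the field names, the required approver set, and the 5% abort threshold are assumptions for illustration and should be replaced with whatever your organization agrees on.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ScanChangePlan:
    window_start: datetime
    window_end: datetime
    targets: list                   # IP ranges, URLs, or ports in scope
    max_qps: int
    include_dos_cases: bool = False
    approvers: set = field(default_factory=set)  # teams that authorized the scan
    abort_error_rate: float = 0.05  # rollback trigger (assumed value)


# Assumed minimum set of approvals; adjust per organization.
REQUIRED_APPROVERS = {"business", "RD", "SRE"}


def plan_problems(plan):
    """Return the reasons a plan is not ready to execute (empty list if it is ready)."""
    problems = []
    if plan.window_end <= plan.window_start:
        problems.append("scan window is empty")
    if not plan.targets:
        problems.append("no targets defined")
    if plan.max_qps <= 0:
        problems.append("max_qps must be positive")
    missing = REQUIRED_APPROVERS - plan.approvers
    if missing:
        problems.append("missing approval from: " + ", ".join(sorted(missing)))
    return problems


def should_abort(plan, total_requests, server_errors):
    """Change observation: signal a rollback when the error ratio exceeds the limit."""
    if total_requests == 0:
        return False
    return server_errors / total_requests > plan.abort_error_rate
```

Refusing to start while plan_problems() returns anything, and polling should_abort() during the scan, turns the notification and rollback steps into checks that cannot be skipped by accident.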
Strictly speaking, if security professionals initiate scanning without following these methods and conduct an aggressive scan right away, it is indeed a lack of professionalism on their part. It is not fair to blame the business side for not understanding or supporting these actions.
Recommendations: Aggressive scanning for the public network, cautious approach for the internal network
When classifying based on the network:
Internet-facing public services: These must undergo security checks because if we don't scan them, malicious actors will continuously target them. Instead of being caught off guard and compromised, it is better to have a planned approach and gradually adapt to being scanned.
Adhering to change management principles can make the rollout somewhat cumbersome and painful. The business side may need one month to modify its monitoring logic (ignoring erroneous requests triggered by the scanner) and to fix the exceptions that certain scans trigger. Unmaintained services that cannot be modified after scanning may need to be whitelisted. These adjustments require collaboration and coordination. Once the adjustment phase is complete and the security team can scan regularly and continuously, the issues described above stop being a problem.
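The monitoring adjustment mentioned above (ignoring erroneous requests triggered by the scanner) could look roughly like the sketch below. It assumes the scanner runs from a dedicated, known source address. The address shown comes from a documentation range and is hypothetical, as is the log format of (client IP, status code) pairs.

```python
# Hypothetical scanner source address (taken from a documentation IP range).
SCANNER_IPS = {"192.0.2.10"}


def client_error_rate(entries, ignore_ips=SCANNER_IPS):
    """Compute the share of 4xx responses, skipping requests from known scanner IPs.

    `entries` is an iterable of (client_ip, status_code) pairs parsed from access logs.
    """
    total = 0
    errors = 0
    for client_ip, status in entries:
        if client_ip in ignore_ips:
            continue  # scanner traffic must not trigger business alerts
        total += 1
        if 400 <= status < 500:
            errors += 1
    return errors / total if total else 0.0
```

The same list of scanner addresses has to be shared with the SRE team, which is one more reason the change notification step matters.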
Internet-facing high-risk services: Only protocol identification is necessary, without performing vulnerability scanning. Having open ports for high-risk services is not recommended, and it is better to directly shut down such services. Scanning for vulnerabilities would only waste resources.
Internet-facing private protocols: Most scanners do not support vulnerability scanning for these protocols, so they should be excluded from the scanning process. However, this may create blind spots, which will not be further discussed here.
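The three cases above amount to a simple decision rule once a port's protocol has been identified. The sketch below illustrates it. The protocol lists are assumed examples, not a recommendation for any specific environment.

```python
from enum import Enum


class Action(Enum):
    FULL_SCAN = "vulnerability scan"
    IDENTIFY_ONLY = "protocol identification only"
    EXCLUDE = "exclude from scanning"


# Assumed examples of high-risk services that should not be exposed to the Internet.
HIGH_RISK_PROTOCOLS = {"telnet", "smb", "rdp", "mysql", "redis"}
# Protocols the (hypothetical) scanner is assumed to support.
SCANNABLE_PROTOCOLS = {"http", "https"}


def decide_action(protocol):
    """Map an identified Internet-facing protocol to the scanning action described above."""
    protocol = protocol.lower()
    if protocol in HIGH_RISK_PROTOCOLS:
        # Report the exposure and push for shutdown instead of scanning further.
        return Action.IDENTIFY_ONLY
    if protocol in SCANNABLE_PROTOCOLS:
        return Action.FULL_SCAN
    # Private or unknown protocols: unsupported by the scanner, so a known blind spot.
    return Action.EXCLUDE
```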
Scanning internal services is much more complex than the aforementioned scenarios. On one hand, the probability of external attackers scanning internal services is relatively low. On the other hand, the reliance on traditional privileges within the internal network leads to a higher number of vulnerabilities compared to the public network. Moreover, internal systems are less resilient to scanning, and the likelihood of issues arising is high.
If only port scanning is performed, and it is confirmed that the switches and routers can handle it, the risk is relatively acceptable (note: there have been cases where older network devices crashed even with a small increase in scanning requests).
However, protocol identification may cause certain vulnerable services to crash, and brute-force attacks could result in account lockouts (which can lead to subsequent incidents). The risks associated with vulnerability scanning are even greater.
Therefore, in many cases, relying solely on network scanning to assess internal network risks is not encouraged. If agents can collect information such as version numbers, configurations, and account-related data, risk data can be gathered without depending on network scanning alone.
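As a sketch of the agent-based alternative, the snippet below matches a software inventory reported by agents against a list of advisories, without sending a single probe. The package names, versions, and inventory layout are hypothetical, and versions are assumed to be purely numeric and dot-separated.

```python
def version_tuple(text):
    """Convert a dotted numeric version such as "1.4.2" into (1, 4, 2)."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical advisories: package name -> first fixed version.
ADVISORIES = {"examplelib": "1.4.2", "demo-server": "3.0.0"}


def find_outdated(inventory, advisories=ADVISORIES):
    """Return {host: [(package, installed, fixed_in), ...]} for packages below the fix.

    `inventory` maps each host to the {package: installed_version} dict its agent reported.
    """
    findings = {}
    for host, packages in inventory.items():
        for name, installed in packages.items():
            fixed = advisories.get(name)
            if fixed and version_tuple(installed) < version_tuple(fixed):
                findings.setdefault(host, []).append((name, installed, fixed))
    return findings
```

Because the data comes from the hosts themselves, this approach avoids the crash and account lockout risks described above, at the cost of deploying and maintaining agents.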
However, does this mean that there are many risks within the internal network?
The harsh reality is yes, this is the situation in the majority of enterprises today, which is indeed very alarming, isn't it? If certain vulnerabilities are critical (e.g., MS17-010) and specifically target certain port services, the internal network can still follow the standard process mentioned above for scanning. However, conducting full-scale vulnerability scanning across the entire internal network becomes challenging.
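For a targeted check of this kind, the scope can be limited to one port across the approved ranges. MS17-010 concerns SMB, which normally listens on TCP port 445. The sketch below only builds the target list. The function name, the address ranges, and the exclusion list are assumptions for illustration.

```python
import ipaddress


def targeted_scope(cidrs, port, exclude=()):
    """Yield (host, port) pairs for one port across the given ranges, skipping exclusions."""
    excluded = {ipaddress.ip_address(addr) for addr in exclude}
    for cidr in cidrs:
        # hosts() skips the network and broadcast addresses of an IPv4 range.
        for host in ipaddress.ip_network(cidr).hosts():
            if host not in excluded:
                yield (str(host), port)
```

For example, list(targeted_scope(["10.0.0.0/30"], 445, exclude=["10.0.0.2"])) yields a single target, ("10.0.0.1", 445). The exclusion list is where fragile or whitelisted hosts agreed during the change notification would go.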
These are some operational insights regarding vulnerability scanning. Feel free to refer to them. If you want to learn more, you are welcome to follow our website or contact Shanghai InsightSec Network Technology Co., Ltd. to obtain further knowledge.