Security management: Threats

About this sub-guideline

This sub-guideline is part of the guideline Security management. It should be read in conjunction with the sub-guideline Security management: Good practices. Refer to the main guideline for context and an overview.

Background

AI systems learn from the data they are fed and then apply models to help them make decisions, generate new content or do anything else they are programmed to do.

For this reason, it is essential that they are fed with correct, clean, unbiased data (for further discussion of this subject, see the guideline Generic risks and biases). Any change to that data, whether intentional or not, may lead to unexpected consequences, results (e.g. a budget management system) or behaviour (e.g. an autonomous car). The end result is akin to teaching improper behaviour or giving wrong information to a child throughout their life.

It is also important to pay attention to the system that will receive user input, send this input to the AI system and then return the result to users. In some cases, attackers can exploit this chain of communication.

Attacks can occur in any phase, from data preparation through to AI system development, deployment and operation (for further discussion of this subject, see the guideline Systems development). As a result, the entire AI system life cycle should be properly supervised in order to minimize unexpected behaviours.

Types of attacks

There are nine common types of attacks on AI systems:

Adversarial attacks
Evasion attacks
Transfer attacks
Data poisoning attacks
Model inversion attacks
Membership inference attacks
Distributed denial of service (DDoS) attacks
Data manipulation attacks
Misuse of AI assistants attacks

Each of these is discussed in turn below.

Adversarial attacks

An adversarial attack involves an attacker manipulating the input data of an AI system so that it produces inaccurate, unexpected or wrong responses. The more information the attacker has about the AI system (especially about the AI model being used), the easier the attack will be. This type of attack often targets AI image recognition systems, making the system incorrectly recognize an image – such as an image of a dog being recognized as a tiger or, worse still, a person being recognized as an animal. The subtle changes in the image are not easily recognizable to the human eye, which makes the issue even harder to solve in certain circumstances. From a parliamentary perspective, an attacker could target a voting system that uses facial recognition technology, causing it to incorrectly allow the attacker to vote as an MP.

Evasion attacks

An evasion attack is considered to be a specific type of adversarial attack. In this case, the attacker intentionally crafts the input data to evade AI detection or classification. For example, the attacker may change the way an unsolicited email is written to avoid being detected by the AI anti-spam system. This can lead to a malicious message getting through to a regular user, who might click a fake link and allow an attacker to gain access to an organization’s network.

*Figure 2: Diagram of an evasion attack*

Transfer attacks

A transfer attack occurs when an attacker uses adversarial-type attacks developed for one model and to deceive other models. The consequences are the same as for an adversarial attack.

Data poisoning attacks

In a data poisoning attack, the attacker adds data to the data set used to train an AI model. The model will learn from incorrect information, leading it to make wrong decisions. For instance, a system could wrongly diagnose a healthy patient as having a deadly cancer – or, worse still, wrongly diagnose a patient with cancer as being healthy, preventing the person from receiving proper treatment. In a parliamentary context, a proposal could be forwarded to the wrong committee for discussion.

*Figure 3: Diagram of a data poisoning attack*

Model inversion attacks

In normal circumstances, an AI model will learn from the input data and produce an output on this basis. The aim of a model inversion attack is to use the output as a way to infer the input data. By doing so, the attacker may gain access to confidential or private information that was used to train the model. For instance, the attacker could get the result of a specific patient’s blood test, or access other, more sensitive data. In a parliamentary context, if an AI model is trained on secret voting data, an attacker may be able to obtain information about how an MP voted.

*Figure 4: Diagram of a model inversion attack*

Membership inference attacks

With a membership inference attack, an attacker aims to find out if individual data records were used to train the AI model. As with a model inversion attack, the attacker may be able to infer sensitive information. For instance, an AI system trained on financial information could leak an individual’s financial history.

*Figure 5: Diagram of a membership inference attack*

Distributed denial of service attacks

A DDoS attack involves an attacker flooding a system – including an AI system – with an excessive number of requests. The aim is to cause the system to stop working, preventing any response or, at least, making it so slow that users will not engage. This often leads to financial losses and reputational harm for the organization running the service. In a parliamentary context, an attacker could bring down the AI chatbot designed to answer questions from citizens during a plenary session discussing a theme with broad public support.

Data manipulation attacks

In a data manipulation attack, the attacker changes the input data (often slightly) in an attempt to generate inaccurate predictions. For instance, an AI system that would normally easily identify a case of fraud may deem such a case to be a regular transaction if the attacker makes minor changes to the input data.

Misuse of AI assistants attacks

The use of AI assistants (such as chatbots or applications built into everyday communication devices) is increasing as these systems become more advanced. It is therefore essential to ensure that a well-thought-out process is in place to select and sanitize the training data set, avoid bias, select the right model for the application, deal with security issues during development, and monitor the application’s usage.

In a parliamentary context, a clerk might use an AI assistant (such as the one pre-installed on their mobile phone) to assist them with daily tasks. However, the results this AI assistant produces may be biased according to the clerk’s political or other personal preferences.

The problem may become more serious if the AI assistant is attacked by a group with particular preferences on any matter parliament may be discussing, especially if this matter is sensitive. Likewise, if parliament decides to develop an AI assistant to support citizens on legislative matters, it must be considered a target for cyberattacks – not least because its audience is often unknown. In this case, prompts need to be at least sanitized prior to their submission as an input to the AI model, in the same way as inputs to any other AI-enabled system.

The Guidelines for AI in parliaments are published by the IPU in collaboration with the Parliamentary Data Science Hub in the IPU’s Centre for Innovation in Parliament. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International licence. It may be freely shared and reused with acknowledgement of the IPU. For more information about the IPU’s work on artificial intelligence, please visit www.ipu.org/AI or contact [email protected].

Impact

National Parliaments

Find a national parliament

Democracy and strong parliaments

Geopolitical groups

12th IPU Global Conference of Young Parliamentarians

Events

Knowledge

Discover the IPU's resources

Security management: Threats

About this sub-guideline

Background

Types of attacks