Introduction to Buffer Overflow
A buffer overflow, also known as a buffer overrun, occurs when an application tries to store more data in a buffer than the buffer can hold. The excess data spills into adjacent memory, where it may overwrite existing data, causing data loss and sometimes a system crash as well. It is a common programming mistake that developers often make unknowingly, and attackers frequently exploit it to gain unauthorized access to data.
What’s Buffer Memory?
Excellent question. A buffer memory, or buffer, is simply a contiguous portion of RAM set aside to hold data temporarily while it is being transferred from one place to another, typically to or from an input or output device. This is done to compensate for the difference in the speeds at which the devices operate.
For example, when you send some documents to print, your latest i7 processor is fast enough to issue the print command almost instantly, while the poor old printer cannot keep up. So the documents are held in buffer memory and passed on to the printer at a speed the printer accepts. This frees your CPU for other tasks.
Buffer Overflow Attack
Because this vulnerability is so well known, attackers routinely try to exploit it through buffer overflow attacks. So how does an attacker carry out such an attack, and what are the consequences?
In a buffer overflow attack, the extra data includes instructions that are intended to trigger damaging activities such as corrupting files, changing data, sending private information across the internet, etc. An attacker would simply take advantage of any program which is waiting for certain user input and inject surplus data into the buffer.
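To make the idea of surplus data spilling into adjacent storage concrete, here is a minimal sketch in C. The struct account, its is_admin flag, and the function unchecked_store are hypothetical names invented for this illustration. The copy is deliberately not checked against the size of name, but it is clamped to the size of the whole struct so the demo itself never writes outside its own object.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: an 8-byte buffer followed directly by a flag. */
struct account {
    char name[8];
    unsigned char is_admin;
};

/* Copies len bytes of input into the record, starting at name, without
 * checking len against sizeof(name). The length is clamped to the size of
 * the whole struct so this demo stays inside its own object. */
void unchecked_store(struct account *acct, const char *input, size_t len)
{
    if (len > sizeof *acct)
        len = sizeof *acct;
    memcpy((unsigned char *)acct, input, len);
}
```

On typical compilers, storing the 9-byte input "AAAAAAAA\x01" into a record whose is_admin was 0 leaves is_admin set to 1: the ninth byte did not fit in name and landed in the neighbouring field. A real attack works on the same principle, except that the neighbouring data is something the program trusts, such as a return address.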
Buffer overflow attacks can be broadly classified into two types:
- Stack-based: The attack targets a buffer allocated on the stack. This is simpler to exploit and is thus the more common type.
- Heap-based: The attack targets a buffer allocated on the heap. This is harder to exploit and is thus far less frequent.
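The difference between the two kinds of buffer can be sketched in C as follows. The function names and BUF_SIZE are assumptions made for this example, and both copies are bounded, so nothing here overflows. The sketch only shows where each buffer lives.

```c
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 16

/* Stack-based: buf lives in this function's stack frame, typically close to
 * saved registers and the return address, and is gone when the function returns. */
size_t stack_buffer_length(const char *input)
{
    char buf[BUF_SIZE];
    strncpy(buf, input, BUF_SIZE - 1);   /* bounded: at most 15 bytes copied */
    buf[BUF_SIZE - 1] = '\0';            /* strncpy may not terminate, so do it here */
    return strlen(buf);
}

/* Heap-based: the buffer comes from malloc and lives until free is called.
 * Its neighbours are other heap blocks and the allocator's bookkeeping data. */
char *heap_buffer_copy(const char *input)
{
    char *buf = malloc(BUF_SIZE);
    if (buf == NULL)
        return NULL;
    strncpy(buf, input, BUF_SIZE - 1);
    buf[BUF_SIZE - 1] = '\0';
    return buf;                          /* the caller must free this */
}
```

What sits next to the buffer is what an overflow would overwrite, which is why the two types are exploited differently.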
The languages most vulnerable to buffer overflow attacks are C, C++, Fortran, and Assembly, because they allow direct access to memory and do not automatically check that a write stays within the bounds of a buffer.
Once data is corrupted, the original data cannot be recovered from the overwritten memory. Moreover, the scale of the attack largely determines the remedy. If the attack is minor and affects only part of an isolated machine’s memory, a simple system format can be the cure. If the attack is widespread and has compromised data on several machines, even formatting the entire network will not help unless the vulnerable program through which the malicious code is injected is fixed.
Prevention is better than cure.
As developers, it is our responsibility to check for buffer overflows in our code. If buffer overflows are handled in the code itself, the security of the system cannot be compromised through buffer overflow attacks.
The following simple precautionary steps can help prevent buffer overflows.
- Where the language supports it, use exception handling to detect out-of-bounds writes and stop execution when one occurs.
- Allocate a buffer large enough for the largest input you expect, and reject or truncate anything longer, so that unexpectedly large volumes of data are dealt with properly.
- Avoid library functions or third-party methods that do not check bounds. Common examples in C/C++ are gets(), scanf() with an unbounded %s, and strcpy().
- Testing should deliberately target such vulnerabilities: test the code rigorously with oversized inputs and fix any bug that may lead to an overflow.
- Modern programming languages, operating systems, and compilers have evolved to halt execution when a buffer overflow is detected. This is a useful automatic safety net, though it does not replace careful bounds checking in your own code.
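As an illustration of the advice to avoid unbounded functions, here is a minimal sketch of bounded replacements for strcpy() and gets(), built on the standard snprintf() and fgets(). The helper names copy_bounded and read_line_bounded are made up for this example; they are one possible approach, not a standard API.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Bounded alternative to strcpy: never writes more than dst_size bytes.
 * Returns 0 if src fit completely, -1 if it was truncated or dst_size is 0. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return -1;
    int needed = snprintf(dst, dst_size, "%s", src);
    return (needed >= 0 && (size_t)needed < dst_size) ? 0 : -1;
}

/* Bounded alternative to gets: reads at most dst_size - 1 characters from
 * stream and strips a trailing newline if one was read.
 * Returns 0 on success, -1 on end of file, error, or an unusable size. */
int read_line_bounded(char *dst, size_t dst_size, FILE *stream)
{
    if (dst_size == 0 || dst_size > INT_MAX)
        return -1;
    if (fgets(dst, (int)dst_size, stream) == NULL)
        return -1;
    dst[strcspn(dst, "\n")] = '\0';
    return 0;
}
```

With a 5-byte destination, copy_bounded keeps "1234" and reports truncation for 123456789 instead of writing past the end of the buffer. Whether to truncate or reject oversized input is a design choice; returning an error code leaves that decision to the caller.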
Try it yourself
Now that you understand so much about buffer overflows and buffer overflow attacks, why not try to code something malicious yourself?
Disclaimer – The following program is for illustration purposes only and should not be used to cause harm of any sort. Any resemblance to malicious code is merely coincidental. Moreover, the operating systems nowadays are smart enough to have buffer-attack-preventive checks in place.
Below is a C program that can cause a buffer overrun. Why C? Because most higher-level languages check array bounds automatically, at compile time or at run time, while C does not. That said, modern C compilers also add certain checks to detect buffer overflows, so you may only see an error message indicating that a buffer overrun was detected.
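#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]) {
    char mybuffer[5]; // size is an assumption; the original declaration is missing
    printf("Storing user input to mybuffer...\n");
    strcpy(mybuffer, argv[1]); // copy the user input to mybuffer, without any bound checking
    printf("mybuffer content= %s\n", mybuffer);
}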
What happened when 123456789 was given as the command-line argument to the program? The program aborted with an error that is raised when the overflow is detected at run time. Compilers and operating systems nowadays add a protective layer. This layer consists of guard values called canaries, which are set to known values (typically chosen when the program starts) and placed in memory next to the buffer. Whenever the buffer overflows, the extra data flows into the adjacent memory and corrupts the value of the canary. As soon as a corrupted canary is detected, execution is aborted.
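The canary mechanism can be imitated by hand to see how it works. The following is only a simplified simulation under stated assumptions: the struct guarded_buffer, the constant CANARY_VALUE, and the three functions are hypothetical names for this sketch. Real compilers insert the canary and the check automatically and usually pick a random value. As a safeguard for the demo, the unchecked write is clamped to the size of the struct.

```c
#include <stddef.h>
#include <string.h>

#define CANARY_VALUE 0xC0FFEEu   /* arbitrary marker chosen for this sketch */

/* An 8-byte buffer with a guard value placed directly after it. */
struct guarded_buffer {
    char data[8];
    unsigned int canary;
};

void guarded_init(struct guarded_buffer *g)
{
    memset(g->data, 0, sizeof g->data);
    g->canary = CANARY_VALUE;
}

/* Writes len bytes starting at data without checking against sizeof(data).
 * The length is clamped to the whole struct so the demo stays in bounds. */
void guarded_write(struct guarded_buffer *g, const char *src, size_t len)
{
    if (len > sizeof *g)
        len = sizeof *g;
    memcpy((unsigned char *)g, src, len);
}

/* Returns 1 if the guard value is untouched, 0 if an overflow changed it. */
int canary_intact(const struct guarded_buffer *g)
{
    return g->canary == CANARY_VALUE;
}
```

After guarded_init, writing "1234" leaves the canary intact, while writing 123456789 runs past the 8-byte buffer and changes it. A program protected this way checks the canary before trusting anything stored beyond the buffer and aborts if the check fails.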
Another example in C++ language:
using namespace std;
Input – 123456789
So, by now, I am sure you understand the importance of buffer handling in your program. Make it a practice to check buffer bounds while writing as well as testing your code. This will help you write secure code.