Updated April 15, 2023
Introduction to Apache POI
While developing an application, we often get client requirements that involve handling Excel files or presenting data to the user in the form of files, for example letting users upload their data as a spreadsheet. Apache POI is an API built for this kind of work. After a few configuration steps, it lets us read and write MS Office files from Java code, which makes the task quicker and less error-prone than handling the file formats by hand.
What is Apache POI?
- Apache POI is an open-source library provided by the Apache Software Foundation. It lets developers create, modify, and display MS Office files from a Java program.
- The API contains classes and methods for reading user data from a file and for creating or writing a new file based on that data.
Components Used to Read Apache POI
Given below are the main components of Apache POI. Each one provides classes and methods for working with one family of MS Office files from a Java program.
- HPBF: Stands for Horrible Publisher Format. It is used to read and write MS Publisher files.
- HSSF: Stands for Horrible Spreadsheet Format. It is used to read and write MS Excel files in the xls format.
- HPSF: Stands for Horrible Property Set Format. It is used to extract the property sets of MS Office files.
- HSLF: Stands for Horrible Slide Layout Format. It is used for PowerPoint presentations and supports creating, reading, and editing them.
- HDGF: Stands for Horrible Diagram Format. It contains classes and methods for handling the binary files of MS Visio.
- POIFS: Stands for Poor Obfuscation Implementation File System. It is the basic component on which the other POI components build. If we want to read a different file type, we can use it directly by writing explicit code.
- HWPF: Stands for Horrible Word Processor Format. It supports MS Word files with the doc extension.
- XSSF: Stands for XML Spreadsheet Format. It is used to read and write MS Excel files with the xlsx extension.
- XWPF: Stands for XML Word Processor Format. It is used to read and write MS Word files with the docx extension.
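The split between the binary components (HSSF, HWPF, HSLF, which sit on top of POIFS) and the XML components (XSSF, XWPF) follows the container each file uses: the older formats are OLE2 compound documents, while xlsx and docx files are ZIP packages. As a rough illustration, the sketch below uses only the Java standard library to look at the leading bytes of a file. The class `OfficeFormatSniffer` and its `Kind` enum are hypothetical names and not part of POI; POI ships its own `FileMagic` utility for this job, which is probably the better choice in real code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper (not a POI class): guesses the container type of an Office file. */
final class OfficeFormatSniffer {

    enum Kind { OLE2, ZIP_BASED, UNKNOWN }

    // First 8 bytes of an OLE2 compound document (xls, doc, ppt, pub, vsd).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // First 4 bytes of a ZIP archive; xlsx and docx files are ZIP packages.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Classifies a header that has already been read into memory. */
    static Kind sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Kind.OLE2;       // candidate for HSSF, HWPF, HSLF, ...
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Kind.ZIP_BASED;  // possibly OOXML: candidate for XSSF, XWPF
        }
        return Kind.UNKNOWN;
    }

    /** Reads up to 8 bytes from the stream and classifies them. */
    static Kind sniff(InputStream in) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int filled = 0;
        while (filled < header.length) {
            int n = in.read(header, filled, header.length - filled);
            if (n < 0) {
                break;              // stream ended before 8 bytes were available
            }
            filled += n;
        }
        return sniff(Arrays.copyOf(header, filled));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A ZIP signature alone does not prove that a file is an Office document, so the result is only a hint about which component to try first.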
Dependencies and Apache POI Installation
The steps below show how to add Apache POI to a Java project.
1. Go to the below-mentioned URL and press enter:
2. Select the correct version and copy the dependency to your pom.xml.
<!-- https://mvnrepository.com/artifact/org.apache.poi/poi -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>5.0.0</version>
</dependency>
3. If you are using Gradle, go to the same URL, choose the correct version, and click the Gradle tab to get the Gradle dependency.
implementation group: 'org.apache.poi', name: 'poi', version: '5.0.0'
4. Add it to the build.gradle file of your application.
Normal Java Project:
To add Apache POI to a plain Java project, follow the instructions below:
a. Download the dependency to your system from the below URL.
b. Extract the folder and place it in the directory you want.
c. In Eclipse, right-click your project and go to Build Path > Libraries > Add External JARs.
d. Select all the JAR files inside the lib and ooxml-lib folders and the root folder of your POI download.
e. Click Apply and then OK.
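Once the dependency is in place, a quick way to confirm the setup is to create a small workbook in memory. The following is a minimal sketch, assuming only the `poi` artifact shown above is on the classpath; the class name `PoiSmokeTest` and the sheet name are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

/** Hypothetical smoke test: fails to compile or run if POI is not on the classpath. */
final class PoiSmokeTest {

    /** Builds a one-cell xls workbook in memory and returns its size in bytes. */
    static int writeTinyXls() {
        try (HSSFWorkbook workbook = new HSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            HSSFSheet sheet = workbook.createSheet("Check");
            sheet.createRow(0).createCell(0).setCellValue("POI is on the classpath");
            workbook.write(out);    // serializes the workbook in the binary xls format
            return out.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Wrote " + writeTinyXls() + " bytes");
    }
}
```

If this compiles and prints a byte count, the JARs were added correctly. Nothing is written to disk, so the check leaves no files behind.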
- HSSFWorkbook: This class represents a whole xls workbook; we create an object of it to read or write the file.
- XSSFWorkbook: This class represents a whole xlsx workbook; we create an object of it to read or write the file.
- HSSFSheet: This class represents a single sheet of an HSSFWorkbook.
- XSSFSheet: This class represents a single sheet of an XSSFWorkbook.
- XSSFRow: This class represents a row of an XSSFSheet.
- XSSFCell: This class represents a cell of an XSSFRow and implements the Cell interface.
- XSSFCellStyle: This class is used to style the cell.
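The XSSF classes listed above can be combined roughly as follows. This is a sketch under two assumptions: the `poi-ooxml` artifact is on the classpath in addition to `poi` (the XSSF classes live there), and the class name `XssfClassesDemo`, the sheet name, and the sample values are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFCellStyle;
import org.apache.poi.xssf.usermodel.XSSFFont;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

/** Hypothetical demo class showing workbook, sheet, row, cell, and cell style together. */
final class XssfClassesDemo {

    /** Creates an xlsx workbook with a bold header cell and one data cell. */
    static byte[] buildWorkbook() {
        try (XSSFWorkbook workbook = new XSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            XSSFSheet sheet = workbook.createSheet("Users");

            // A cell style is created by the workbook and can be shared by many cells.
            XSSFFont boldFont = workbook.createFont();
            boldFont.setBold(true);
            XSSFCellStyle headerStyle = workbook.createCellStyle();
            headerStyle.setFont(boldFont);

            XSSFRow headerRow = sheet.createRow(0);
            XSSFCell headerCell = headerRow.createCell(0);
            headerCell.setCellValue("Name");
            headerCell.setCellStyle(headerStyle);

            XSSFRow dataRow = sheet.createRow(1);
            dataRow.createCell(0).setCellValue("Alice");

            workbook.write(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a text cell back from the first sheet of an in-memory xlsx file. */
    static String readCell(byte[] xlsx, int rowIndex, int columnIndex) {
        try (XSSFWorkbook workbook = new XSSFWorkbook(new ByteArrayInputStream(xlsx))) {
            XSSFSheet sheet = workbook.getSheetAt(0);
            return sheet.getRow(rowIndex).getCell(columnIndex).getStringCellValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `XssfClassesDemo.readCell(XssfClassesDemo.buildWorkbook(), 1, 0)` should give back the value written into the data row. The HSSF classes follow the same pattern for xls files.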
- It provides stream-based processing, which suits large files.
- It can handle both XLS and XLSX.
- It provides good support for additional Excel features such as formulas and cell styles.
- With stream-based processing, it also requires less memory.
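To illustrate the point about handling both XLS and XLSX, the sketch below reads the first cell of a workbook without knowing its format in advance. It assumes POI 4.x or later with both `poi` and `poi-ooxml` on the classpath; the class `AnyExcelReader` and its helper `sample` are hypothetical names used only here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

/** Hypothetical reader that works on xls and xlsx content through the shared interfaces. */
final class AnyExcelReader {

    /** Returns the text of cell A1 on the first sheet, whatever the file format. */
    static String firstCell(byte[] fileBytes) {
        // WorkbookFactory inspects the content and returns an HSSF or XSSF workbook.
        try (Workbook workbook = WorkbookFactory.create(new ByteArrayInputStream(fileBytes))) {
            Cell cell = workbook.getSheetAt(0).getRow(0).getCell(0);
            return new DataFormatter().formatCellValue(cell);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fills the given empty workbook with one cell and serializes it (sample data only). */
    static byte[] sample(Workbook emptyWorkbook, String value) {
        try (Workbook workbook = emptyWorkbook;
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            workbook.createSheet("Data").createRow(0).createCell(0).setCellValue(value);
            workbook.write(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, `AnyExcelReader.firstCell(AnyExcelReader.sample(new HSSFWorkbook(), "hello"))` and the same call with `new XSSFWorkbook()` go through identical reading code. Programming against the `Workbook` and `Cell` interfaces is what keeps the reader independent of the file format.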
This article walked through the usage, benefits, and features of Apache POI for reading and writing MS Office files from a Java program. Its ready-made classes and methods make this work quick and easy for the developer.
This is a guide to Apache POI. Here we discuss the introduction, components used to read Apache POI, dependencies and installation, and features.