forked from mladjenigor/spout
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
merge master, resolve conflicts (box#447)
Showing 4 changed files with 8 additions and 362 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,12 @@ | ||
language: php | ||
|
||
dist: trusty | ||
|
||
php: | ||
- 5.6 | ||
- 7.0 | ||
- 7.1 | ||
- hhvm | ||
- hhvm-3.6 | ||
|
||
cache: | ||
directories: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,29 +12,10 @@ Contrary to other file readers or writers, it is capable of processing very larg | |
|
||
Join the community and come discuss Spout: [](https://gitter.im/box/spout?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge) | ||
|
||
## Installation | ||
|
||
### Composer (recommended) | ||
## Documentation | ||
|
||
Spout can be installed directly from [Composer](https://getcomposer.org/). | ||
|
||
Run the following command: | ||
``` | ||
$ composer require box/spout | ||
``` | ||
|
||
### Manual installation | ||
|
||
If you can't use Composer, no worries! You can still install Spout manually. | ||
|
||
> Before starting, make sure your system meets the [requirements](#requirements). | ||
1. Download the source code from the [Releases page](https://github.com/box/spout/releases) | ||
2. Extract the downloaded content into your project. | ||
3. Add this code to the top controller (index.php) or wherever it may be more appropriate: | ||
```php | ||
require_once '[PATH/TO]/src/Spout/Autoloader/autoload.php'; // don't forget to change the path! | ||
``` | ||
Full documentation can be found at [http://opensource.box.com/spout/](http://opensource.box.com/spout/). | ||
|
||
|
||
## Requirements | ||
|
@@ -44,301 +25,6 @@ require_once '[PATH/TO]/src/Spout/Autoloader/autoload.php'; // don't forget to c | |
* PHP extension `php_xmlreader` enabled | ||
|
||
|
||
## Basic usage | ||
|
||
### Reader | ||
|
||
Regardless of the file type, the interface to read a file is always the same: | ||
|
||
```php | ||
use Box\Spout\Reader\ReaderFactory; | ||
use Box\Spout\Common\Type; | ||
|
||
$reader = ReaderFactory::create(Type::XLSX); // for XLSX files | ||
//$reader = ReaderFactory::create(Type::CSV); // for CSV files | ||
//$reader = ReaderFactory::create(Type::ODS); // for ODS files | ||
|
||
$reader->open($filePath); | ||
|
||
foreach ($reader->getSheetIterator() as $sheet) { | ||
foreach ($sheet->getRowIterator() as $row) { | ||
// do stuff with the row | ||
} | ||
} | ||
|
||
$reader->close(); | ||
``` | ||
|
||
If there are multiple sheets in the file, the reader will read all of them sequentially. | ||
|
||
### Writer | ||
|
||
As with the reader, there is one common interface to write data to a file: | ||
|
||
```php | ||
use Box\Spout\Writer\WriterFactory; | ||
use Box\Spout\Common\Type; | ||
|
||
$writer = WriterFactory::create(Type::XLSX); // for XLSX files | ||
//$writer = WriterFactory::create(Type::CSV); // for CSV files | ||
//$writer = WriterFactory::create(Type::ODS); // for ODS files | ||
|
||
$writer->openToFile($filePath); // write data to a file or to a PHP stream | ||
//$writer->openToBrowser($fileName); // stream data directly to the browser | ||
|
||
$writer->addRow($singleRow); // add a row at a time | ||
$writer->addRows($multipleRows); // add multiple rows at a time | ||
|
||
$writer->close(); | ||
``` | ||
|
||
For XLSX and ODS files, the number of rows per sheet is limited to 1,048,576. By default, once this limit is reached, the writer will automatically create a new sheet and continue writing data into it. | ||
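
The overflow behavior described above implies a simple calculation for how many sheets a given row count is likely to produce. The following is a minimal sketch in plain PHP, not part of Spout's API: the function name `estimateSheetCount` and the constant `MAX_ROWS_PER_SHEET` are hypothetical, and only the 1,048,576 limit comes from the text.

```php
<?php
// Hypothetical helper, not part of Spout: estimates how many sheets would be
// produced if a new sheet is started each time the per-sheet row limit
// quoted above (1,048,576) is reached.
const MAX_ROWS_PER_SHEET = 1048576;

function estimateSheetCount($totalRows)
{
    if ($totalRows <= 0) {
        return 1; // assumption: a workbook always contains at least one sheet
    }

    return (int) ceil($totalRows / MAX_ROWS_PER_SHEET);
}

echo estimateSheetCount(2000000), "\n"; // 2,000,000 rows do not fit in one sheet
```

An estimate like this can help decide up front whether automatic sheet creation is acceptable for a given export or whether the data should be split deliberately.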
|
||
|
||
## Advanced usage | ||
|
||
If you are looking for how to perform some common, more advanced tasks with Spout, please take a look at the [Wiki](https://github.com/box/spout/wiki). It contains code snippets, ready to be used. | ||
|
||
### Configuring the CSV reader and writer | ||
|
||
It is possible to configure both the CSV reader and writer to specify the field separator as well as the field enclosure: | ||
```php | ||
use Box\Spout\Reader\ReaderFactory; | ||
use Box\Spout\Common\Type; | ||
|
||
$reader = ReaderFactory::create(Type::CSV); | ||
$reader->setFieldDelimiter('|'); | ||
$reader->setFieldEnclosure('@'); | ||
$reader->setEndOfLineCharacter("\r"); | ||
``` | ||
|
||
Additionally, if you need to read non-UTF-8 files, you can specify the encoding of your file this way: | ||
```php | ||
$reader->setEncoding('UTF-16LE'); | ||
``` | ||
|
||
By default, the writer generates CSV files encoded in UTF-8, with a BOM. | ||
It is however possible to not include the BOM: | ||
```php | ||
use Box\Spout\Writer\WriterFactory; | ||
use Box\Spout\Common\Type; | ||
|
||
$writer = WriterFactory::create(Type::CSV); | ||
$writer->setShouldAddBOM(false); | ||
``` | ||
|
||
|
||
### Configuring the XLSX and ODS readers and writers | ||
|
||
#### Row styling | ||
|
||
It is possible to apply some formatting options to a row. Spout supports fonts, background, borders as well as alignment styles. | ||
|
||
```php | ||
use Box\Spout\Common\Type; | ||
use Box\Spout\Writer\WriterFactory; | ||
use Box\Spout\Writer\Common\Creator\Style\StyleBuilder; | ||
use Box\Spout\Writer\Style\Color; | ||
|
||
$style = (new StyleBuilder()) | ||
->setFontBold() | ||
->setFontSize(15) | ||
->setFontColor(Color::BLUE) | ||
->setShouldWrapText() | ||
->setBackgroundColor(Color::YELLOW) | ||
->build(); | ||
|
||
$writer = WriterFactory::create(Type::XLSX); | ||
$writer->openToFile($filePath); | ||
|
||
$writer->addRowWithStyle($singleRow, $style); // style will only be applied to this row | ||
$writer->addRow($otherSingleRow); // no style will be applied | ||
$writer->addRowsWithStyle($multipleRows, $style); // style will be applied to all given rows | ||
|
||
$writer->close(); | ||
``` | ||
|
||
Adding borders to a row requires a ```Border``` object. | ||
|
||
```php | ||
use Box\Spout\Common\Type; | ||
use Box\Spout\Writer\Style\Border; | ||
use Box\Spout\Writer\Common\Creator\Style\BorderBuilder; | ||
use Box\Spout\Writer\Style\Color; | ||
use Box\Spout\Writer\Common\Creator\Style\StyleBuilder; | ||
use Box\Spout\Writer\WriterFactory; | ||
|
||
$border = (new BorderBuilder()) | ||
->setBorderBottom(Color::GREEN, Border::WIDTH_THIN, Border::STYLE_DASHED) | ||
->build(); | ||
|
||
$style = (new StyleBuilder()) | ||
->setBorder($border) | ||
->build(); | ||
|
||
$writer = WriterFactory::create(Type::XLSX); | ||
$writer->openToFile($filePath); | ||
|
||
$writer->addRowWithStyle(['Border Bottom Green Thin Dashed'], $style); | ||
|
||
$writer->close(); | ||
``` | ||
|
||
Spout will use a default style for all created rows. This style can be overridden this way: | ||
|
||
```php | ||
$defaultStyle = (new StyleBuilder()) | ||
->setFontName('Arial') | ||
->setFontSize(11) | ||
->build(); | ||
|
||
$writer = WriterFactory::create(Type::XLSX); | ||
$writer->setDefaultRowStyle($defaultStyle) | ||
->openToFile($filePath); | ||
``` | ||
|
||
Unfortunately, Spout does not support all the possible formatting options yet. But you can find the most important ones: | ||
|
||
| Category | Property | API | ||
|-----------|---------------|--------------------------------------- | ||
| Font | Bold | `StyleBuilder::setFontBold()` | ||
| | Italic | `StyleBuilder::setFontItalic()` | ||
| | Underline | `StyleBuilder::setFontUnderline()` | ||
| | Strikethrough | `StyleBuilder::setFontStrikethrough()` | ||
| | Font name | `StyleBuilder::setFontName('Arial')` | ||
| | Font size | `StyleBuilder::setFontSize(14)` | ||
| | Font color | `StyleBuilder::setFontColor(Color::BLUE)`<br>`StyleBuilder::setFontColor(Color::rgb(0, 128, 255))` | ||
| Alignment | Wrap text | `StyleBuilder::setShouldWrapText(true|false)` | ||
|
||
#### New sheet creation | ||
|
||
It is also possible to change the behavior of the writer when the maximum number of rows (1,048,576) has been written in the current sheet: | ||
```php | ||
use Box\Spout\Writer\WriterFactory; | ||
use Box\Spout\Common\Type; | ||
|
||
$writer = WriterFactory::create(Type::ODS); | ||
$writer->setShouldCreateNewSheetsAutomatically(true); // default value | ||
$writer->setShouldCreateNewSheetsAutomatically(false); // will stop writing new data when limit is reached | ||
``` | ||
|
||
#### Using custom temporary folder | ||
|
||
Processing XLSX and ODS files requires temporary files to be created. By default, Spout will use the system default temporary folder (as returned by `sys_get_temp_dir()`). It is possible to override this by explicitly setting it on the reader or writer: | ||
```php | ||
use Box\Spout\Writer\WriterFactory; | ||
use Box\Spout\Common\Type; | ||
|
||
$writer = WriterFactory::create(Type::XLSX); | ||
$writer->setTempFolder($customTempFolderPath); | ||
``` | ||
|
||
#### Strings storage (XLSX writer) | ||
|
||
XLSX files support different ways to store the string values: | ||
* Shared strings are meant to optimize file size by separating strings from the sheet representation and ignoring duplicate strings (if a string is used three times, only one string will be stored) | ||
* Inline strings are less optimized (as duplicate strings are all stored) but are faster to process | ||
|
||
To keep memory usage low, Spout does not de-duplicate strings when using shared strings. It is nevertheless possible to use this mode: | ||
```php | ||
use Box\Spout\Writer\WriterFactory; | ||
use Box\Spout\Common\Type; | ||
|
||
$writer = WriterFactory::create(Type::XLSX); | ||
$writer->setShouldUseInlineStrings(true); // default (and recommended) value | ||
$writer->setShouldUseInlineStrings(false); // will use shared strings | ||
``` | ||
|
||
> ##### Note on Apple Numbers and iOS support | ||
> | ||
> Apple's products (Numbers and the iOS previewer) don't support inline strings and display empty cells instead. Therefore, if these platforms need to be supported, make sure to use shared strings! | ||
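
To make the difference between the two storage modes concrete, here is a conceptual sketch of a de-duplicating shared-strings table. It is an illustration of the idea only, not Spout's implementation; the function name `buildSharedStrings` is hypothetical.

```php
<?php
// Conceptual illustration only; this is not how Spout writes XLSX files.
// A shared-strings table stores each distinct string once and lets cells
// refer to it by index, whereas inline storage repeats the string per cell.
function buildSharedStrings(array $cellValues)
{
    $table = array();         // index => string, in first-seen order
    $indexByString = array(); // string => index, used to detect duplicates
    $cellRefs = array();      // one table index per cell

    foreach ($cellValues as $value) {
        if (!isset($indexByString[$value])) {
            $indexByString[$value] = count($table);
            $table[] = $value;
        }
        $cellRefs[] = $indexByString[$value];
    }

    return array($table, $cellRefs);
}

list($table, $refs) = buildSharedStrings(array('red', 'blue', 'red', 'red'));
// $table now holds the 2 distinct strings; $refs holds 4 indexes into it
```

The lookup map grows with the number of distinct strings, which is consistent with the statement above that Spout skips this optimization to keep memory usage low.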
|
||
#### Date/Time formatting | ||
|
||
When reading a spreadsheet containing dates or times, Spout returns the values by default as DateTime objects. | ||
It is possible to change this behavior and have a formatted date returned instead (e.g. "2016-11-29 1:22 AM"). The format of the date corresponds to what is specified in the spreadsheet. | ||
|
||
```php | ||
use Box\Spout\Reader\ReaderFactory; | ||
use Box\Spout\Common\Type; | ||
|
||
$reader = ReaderFactory::create(Type::XLSX); | ||
$reader->setShouldFormatDates(false); // default value | ||
$reader->setShouldFormatDates(true); // will return formatted dates | ||
``` | ||
|
||
### Playing with sheets | ||
|
||
When creating an XLSX or ODS file, it is possible to control which sheet the data will be written into. At any time, you can retrieve or set the current sheet: | ||
```php | ||
$firstSheet = $writer->getCurrentSheet(); | ||
$writer->addRow($rowForSheet1); // writes the row to the first sheet | ||
|
||
$newSheet = $writer->addNewSheetAndMakeItCurrent(); | ||
$writer->addRow($rowForSheet2); // writes the row to the new sheet | ||
|
||
$writer->setCurrentSheet($firstSheet); | ||
$writer->addRow($anotherRowForSheet1); // append the row to the first sheet | ||
``` | ||
|
||
It is also possible to retrieve all the sheets currently created: | ||
```php | ||
$sheets = $writer->getSheets(); | ||
``` | ||
|
||
If you rely on the sheet's name in your application, you can access it and customize it this way: | ||
```php | ||
// Accessing the sheet name when reading | ||
foreach ($reader->getSheetIterator() as $sheet) { | ||
$sheetName = $sheet->getName(); | ||
} | ||
|
||
// Accessing the sheet name when writing | ||
$sheet = $writer->getCurrentSheet(); | ||
$sheetName = $sheet->getName(); | ||
|
||
// Customizing the sheet name when writing | ||
$sheet = $writer->getCurrentSheet(); | ||
$sheet->setName('My custom name'); | ||
``` | ||
|
||
> Please note that Excel has some restrictions on the sheet's name: | ||
> * it must not be blank | ||
> * it must not exceed 31 characters | ||
> * it must not contain these characters: \ / ? * : [ or ] | ||
> * it must not start or end with a single quote | ||
> * it must be unique | ||
> | ||
> Handling these restrictions is the developer's responsibility. Spout does not try to automatically change the sheet's name, as one may rely on this name to be exactly what was passed in. | ||
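
Since handling these restrictions is left to the developer, a candidate name can be checked before it is passed to `setName()`. The following is a minimal sketch, not part of Spout: the function name `findSheetNameViolations` is hypothetical, and treating whitespace-only names as blank and comparing names by exact match are assumptions.

```php
<?php
// Hypothetical helper, not part of Spout: returns the rules from the note
// above that a candidate sheet name violates (empty array if none).
function findSheetNameViolations($name, array $existingNames = array())
{
    $violations = array();

    // Assumption: a name made only of whitespace counts as blank
    if (trim($name) === '') {
        $violations[] = 'blank';
    }

    // Count characters rather than bytes when mbstring is available
    $length = function_exists('mb_strlen') ? mb_strlen($name, 'UTF-8') : strlen($name);
    if ($length > 31) {
        $violations[] = 'too long';
    }

    // Forbidden characters: \ / ? * : [ ]
    if (strpbrk($name, '\\/?*:[]') !== false) {
        $violations[] = 'invalid character';
    }

    if (substr($name, 0, 1) === "'" || substr($name, -1) === "'") {
        $violations[] = 'leading or trailing quote';
    }

    // Assumption: uniqueness is checked by exact string comparison
    if (in_array($name, $existingNames, true)) {
        $violations[] = 'not unique';
    }

    return $violations;
}

print_r(findSheetNameViolations('Q1/Q2 totals', array('Summary')));
```

Returning the list of violated rules, rather than a boolean, lets the caller decide whether to reject the name or repair it.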
Finally, it is possible to know which sheet was active when the spreadsheet was last saved. This can be useful if you are only interested in processing the one sheet that was last focused. | ||
```php | ||
foreach ($reader->getSheetIterator() as $sheet) { | ||
// only process data for the active sheet | ||
if ($sheet->isActive()) { | ||
// do something... | ||
} | ||
} | ||
``` | ||
|
||
### Fluent interface | ||
|
||
Because fluent interfaces are great, you can use them with Spout: | ||
```php | ||
use Box\Spout\Writer\WriterFactory; | ||
use Box\Spout\Common\Type; | ||
|
||
$writer = WriterFactory::create(Type::XLSX); | ||
$writer->setTempFolder($customTempFolderPath) | ||
->setShouldUseInlineStrings(true) | ||
->openToFile($filePath) | ||
->addRow($headerRow) | ||
->addRows($dataRows) | ||
->close(); | ||
``` | ||
|
||
|
||
## Running tests | ||
|
||
On the `master` branch, only unit and functional tests are included. The performance tests require very large files and have been excluded. | ||
|
@@ -355,39 +41,9 @@ For information, the performance tests take about 30 minutes to run (processing | |
> Performance tests status: [](https://travis-ci.org/box/spout) | ||
|
||
## Frequently Asked Questions | ||
|
||
#### How can Spout handle such large data sets and still use less than 3MB of memory? | ||
|
||
When writing data, Spout streams the data to files, one or a few lines at a time. It therefore keeps in memory only the few rows that it needs to write; once they are written, the memory is freed. | ||
|
||
The same goes for reading: only one row at a time is stored in memory. A special technique is used to handle shared strings in XLSX, storing them, if needed, in several small temporary files that allow fast access. | ||
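
The row-at-a-time idea described in this answer can be illustrated in plain PHP without any Spout classes. In this sketch the function names `generateRows` and `streamRowsToCsv` and the sample data are hypothetical; it shows the general streaming pattern, not Spout's internals.

```php
<?php
// Conceptual sketch in plain PHP (no Spout classes): rows come from a
// generator and are written one at a time, so the full data set is never
// held in memory at once.
function generateRows($count)
{
    for ($i = 1; $i <= $count; $i++) {
        yield array($i, 'name ' . $i, $i * 2); // hypothetical sample data
    }
}

function streamRowsToCsv($rows, $filePath)
{
    $handle = fopen($filePath, 'w');
    if ($handle === false) {
        throw new RuntimeException('Cannot open ' . $filePath);
    }

    $written = 0;
    foreach ($rows as $row) {
        // Each row is written immediately; nothing accumulates in an array
        fputcsv($handle, $row, ',', '"', '\\');
        $written++;
    }

    fclose($handle);

    return $written;
}

$path = tempnam(sys_get_temp_dir(), 'rows');
echo streamRowsToCsv(generateRows(10000), $path), " rows written\n";
unlink($path);
```

Because the generator produces each row only when the loop asks for it, peak memory stays roughly constant as the row count grows.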
|
||
#### How long does it take to generate a file with X rows? | ||
|
||
Here are a few numbers regarding the performance of Spout: | ||
|
||
| Type | Action | 2,000 rows (6,000 cells) | 200,000 rows (600,000 cells) | 2,000,000 rows (6,000,000 cells) | | ||
|------|-------------------------------|--------------------------|------------------------------|----------------------------------| | ||
| CSV | Read | < 1 second | 4 seconds | 2-3 minutes | | ||
| | Write | < 1 second | 2 seconds | 2-3 minutes | | ||
| XLSX | Read<br>*inline strings* | < 1 second | 35-40 seconds | 18-20 minutes | | ||
| | Read<br>*shared strings* | 1 second | 1-2 minutes | 35-40 minutes | | ||
| | Write | 1 second | 20-25 seconds | 8-10 minutes | | ||
| ODS | Read | 1 second | 1-2 minutes | 5-6 minutes | | ||
| | Write | < 1 second | 35-40 seconds | 5-6 minutes | | ||
|
||
#### Does Spout support charts or formulas? | ||
|
||
No. This is a compromise to keep memory usage low. Charts and formulas require data to be kept in memory in order to be used. | ||
So the larger the file, the more memory would be consumed, preventing your code from scaling well. | ||
|
||
|
||
## Support | ||
|
||
Need to contact us directly? Email [email protected] and be sure to include the name of this project in the subject. | ||
|
||
You can also ask questions, submit new feature ideas, or discuss Spout in the chat room:<br> | ||
You can ask questions, submit new feature ideas, or discuss Spout in the chat room:<br> | ||
[](https://gitter.im/box/spout?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge) | ||
|
||
## Copyright and License | ||
|