Why I Can't Wait For PHP 6: Part 1

10 Jun 2008 | charset, PHP, UTF-8

Today I decided to take the initiative to add UTF-8 support to our Item Import and Update tools. Easier said than done. I have had my share of Unicode issues with PHP, as I am sure everyone has. This is the first one that I have not been able to conquer.

Our tools use an uploaded CSV file to both import new items and update existing ones in the WorkXpress application. The file can be either comma- or tab-delimited. The Unicode problem arises when the uploaded file contains any non-ASCII UTF-8 characters. We use fopen to open the file and fgetcsv to parse it. However, fgetcsv does not handle the UTF-8 characters correctly. After an hour of experimenting, I could not get any function to read the UTF-8 characters properly, not even file_get_contents.
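As a rough sketch of the reading step described above, the fopen and fgetcsv loop might look something like the following. The actual import code is not shown in this post, so the helper name readCsvRows and its parameters are assumptions made purely for illustration.

```php
<?php
// Hypothetical helper: a sketch of the fopen/fgetcsv loop, not the
// real WorkXpress import code.
function readCsvRows($path, $delimiter = ',')
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $rows = [];
    // fgetcsv returns false at end of file (or on a read error).
    // A length of 0 means "no limit on line length".
    while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        // A blank line comes back as an array holding a single null; skip it.
        if ($row === [null]) {
            continue;
        }
        $rows[] = $row;
    }

    fclose($handle);
    return $rows;
}
```

Passing "\t" as the second argument, as in readCsvRows('utf_import.csv', "\t"), would read a tab-delimited upload instead.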

For my test, I used a three-line file called utf_import.csv. The file looked similar to the following:

"cafe Good","bold1"
"café Not So Good","bold2"
"cafae Okay I Guess","bold3"

However, I received the following results:

  0 => string 'cafe Good'
  1 => string 'bold1'

  0 => string 'caf� Not So Good'
  1 => string 'bold2'

  0 => string 'cafae Okay I Guess'
  1 => string 'bold3'
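One way to narrow down where the character gets damaged is to look at the raw bytes of a parsed value. The sketch below is only an assumption about how one might check this and is not part of the original tools; describeBytes is a hypothetical helper. It relies on preg_match with the /u modifier, which fails on a subject that is not valid UTF-8.

```php
<?php
// Hypothetical diagnostic helper, not from the original import tools.
function describeBytes($value)
{
    // With the /u modifier, preg_match returns false (not 1) when the
    // subject string is not valid UTF-8.
    $isUtf8 = preg_match('//u', $value) === 1;

    return sprintf(
        '%s [%s]',
        bin2hex($value),
        $isUtf8 ? 'valid UTF-8' : 'not valid UTF-8'
    );
}

// The escapes spell out the bytes so the result does not depend on
// how this source file itself is encoded.
echo describeBytes("caf\xC3\xA9"), "\n"; // é as two UTF-8 bytes
echo describeBytes("caf\xE9"), "\n";     // é as one ISO-8859-1 byte
```

If a parsed field shows c3a9 for the é, the bytes survived parsing and the damage probably happens later, for example at display time. A lone e9 would suggest the file was saved in a one-byte encoding rather than UTF-8.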

The Internet offered little help. I found several posts suggesting a call to setlocale(LC_ALL, 'en_US.UTF-8'), which would make sense based on this note in the fgetcsv documentation:

Note: Locale setting is taken into account by this function. If LANG is e.g. en_US.UTF-8, files in one-byte encoding are read wrong by this function.
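For reference, the suggested call can be sketched as follows. The wrapper name trySetUtf8Locale is hypothetical. The one detail worth noting is that setlocale returns false when the requested locale is not installed on the server, so checking the return value at least rules out a missing locale as the cause.

```php
<?php
// Hypothetical wrapper around the setlocale call suggested in the posts.
function trySetUtf8Locale($locale = 'en_US.UTF-8')
{
    // setlocale returns the name of the new locale on success, or false
    // if the locale is not available on this system.
    $applied = setlocale(LC_ALL, $locale);

    return $applied !== false;
}

if (trySetUtf8Locale()) {
    echo "Locale switched to en_US.UTF-8\n";
} else {
    echo "en_US.UTF-8 is not installed; the locale is unchanged\n";
}
```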

Unfortunately, even this does not work. Some time later, I came across PHP Bug #38471: fgetcsv(): locale dependency of delimiter / enclosure arg. The response to this bug:

We’re working Unicode support in PHP6. but it won’t appear in previous versions.
