Showing entries 1 to 3
Displaying posts with tag: encoding (reset)
Character Sets: Migrating to utf8mb4 with pt_online_schema_change

Modern applications often feature the use of data in many different languages. This is often true even of applications that only offer a user facing interface in a single language. Many users may, for example, need to enter names which, although using Latin characters, feature diacritics; in other cases, they may need to enter text which contains Chinese or Japanese characters. Even if a user is capable of using an application localized for only one language, it may be necessary to deal with data from a wide variety of languages.

Additionally, increased use of mobile phones has lead to changes in communications behaviour; this includes a vastly increased use of standardized characters intended to convey emotions, often called “emojis” or “emoticons.” Originally, such information was conveyed using ASCII text, such as “:-)” to indicate happiness – but, as noted, this has changed, with many devices automatically converting such …

[Read more]
Toward A Practical Perceptual Video Quality Metric

by Zhi Li, Anne Aaron, Ioannis Katsavounidis, Anush Moorthy and Megha Manohara

At Netflix we care about video quality, and we care about measuring video quality accurately at scale. Our method, Video Multimethod Assessment Fusion (VMAF), seeks to reflect the viewer’s perception of our streaming quality.  We are open-sourcing this tool and invite the research community to collaborate with us on this important project.
Our Quest for High Quality Video
We strive to provide our members with a great viewing experience: smooth video playback, free of annoying picture artifacts. A significant part of this endeavor is delivering video streams with the best perceptual quality possible, given the constraints of the network bandwidth and viewing device. We continuously work towards this goal through multiple efforts.
First, we innovate in the area of video encoding. Streaming video requires compression using …

[Read more]
Battling XHTML :: Storing UTF-8 data in MySQL

In the xml parser that I’ve been writing for rss/atom feeds I’ve encountered what many people have found; bizarre encoding issues when displaying the data from the database on a webpage. Since this is not really well explained by the searches I did on google I’ll explain it here.

Issue: you have utf-8 data coming from a source, you put it into a utf8_general_ci column of a mysql database table. You read the data from the database and display it as html/xhtml. Instead of getting things like double backquotes or long dashes you get euro signs or umlaut type of characters, usually strings of them instead of the correct format.

Potential solution: use utf8_encode and htmlentities in PHP to clean the data before going into the database. This does not work. Why? Those characters are not covered by html standards since they are above ascii code 126. See here for the full code chart: …

[Read more]
Showing entries 1 to 3