Posting HTML forms with special characters, while keeping your database clean.

The best practice when storing data in a database is to store it in its purest form.
When you let users edit data through HTML pages, you need to encode some characters so your HTML forms won’t break. You can do this with htmlspecialchars (or htmlentities). Below is an example with htmlspecialchars and the ENT_COMPAT flag, which encodes double quotes (along with &, < and >) but leaves single quotes alone.

Mind the accept-charset attribute on the form: I try to work with UTF-8 and UTF-8 only. (Check that your collation in MySQL is also set to UTF-8!)

You can run this code on localhost (e.g. XAMPP).

<?php
header("Cache-Control: no-cache, must-revalidate"); // HTTP/1.1
header("Expires: Sat, 26 Jul 1997 05:00:00 GMT"); // Date in the past
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>htmlspecialchars (utf-8 encoding)</title>
</head>
<body>
<h1>htmlspecialchars (utf-8 encoding)</h1>
<h2>Before form post</h2>
<?php
$dbvalue = "String: ? < > ' - \" `´& % ‰ € ® 2011";
$formvalue = htmlspecialchars($dbvalue, ENT_COMPAT,"UTF-8");
?>
<p><strong>String</strong> is a value coming from a database record in its cleanest form: <span style="color:green;"><?php echo htmlspecialchars($dbvalue); ?></span> </p>
<p>For use in a text field, special characters, especially the double quotes, must be encoded so the <em>value=&quot;&quot;</em> attribute doesn't break. We use the <strong>htmlspecialchars</strong> function with ENT_COMPAT. ENT_COMPAT changes double quotes into &amp;quot; and leaves single quotes alone (&lt;, &gt; and &amp; are always encoded).</p>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, "UTF-8"); ?>" method="post" accept-charset="UTF-8">
	<label>String:
	<input name="string" type="text" value="<?php echo $formvalue; ?>" size="50" /></label>
	<br />
	The value of string in the HTML source of this form looks like <span style="color:red;"><?php echo htmlspecialchars($formvalue); ?></span><br />
	<input name="submit" type="submit" value="submit this form" />
</form>
<?php if (isset($_POST['string'])) { ?>
<h2>Yes, the form was posted</h2>
<p>When the form is <strong>submitted</strong>, the <strong>string</strong> field again has its value in its purest form: a plain &quot; instead of &amp;quot;, without the htmlspecialchars encoding.</p>
<p><strong>String</strong> has submitted value: <span style="color:green;">
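	<?php /* $_POST['string'] holds the raw value; encode it only for display */ echo htmlspecialchars($_POST['string'], ENT_COMPAT, "UTF-8"); ?>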
	</span></p>
<?php } ?>
</body>
</html>
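
To see what the flags change without a browser, here is a small command-line sketch. The sample string is made up for illustration. It compares ENT_COMPAT with ENT_QUOTES and then decodes the result, roughly the way a browser decodes the value attribute before posting it.

```php
<?php
// Hypothetical sample value; any string with quotes and angle brackets will do.
$raw = 'He said "hi" & <left> \'ok\'';

// ENT_COMPAT: encodes & < > and double quotes, leaves single quotes alone.
$compat = htmlspecialchars($raw, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES: additionally encodes single quotes (as &#039;).
$quotes = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $compat, "\n"; // He said &quot;hi&quot; &amp; &lt;left&gt; 'ok'
echo $quotes, "\n"; // He said &quot;hi&quot; &amp; &lt;left&gt; &#039;ok&#039;

// Decoding gives back the original string, which is why the posted
// value arrives in its raw form and can be stored as-is.
var_dump(htmlspecialchars_decode($compat, ENT_COMPAT) === $raw); // bool(true)
```

ENT_QUOTES is the safer choice if an attribute might ever be delimited with single quotes. ENT_COMPAT is enough for the double-quoted value attribute used in the form above.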

XHTML: some good practices

Some good practices on XHTML

  • <doctype> Always put a doctype before the <html> tag so the browser doesn’t fall into quirks mode; see the recommended list of doctypes (W3Schools)
    • Transitional is the most common
    • Strict: prohibits the target attribute (e.g. target="_blank") and a few HTML elements, like s, center, strike, u, applet, iframe, font, isindex, dir, basefont
  • <h1> Only one h1 per page: use for the main topic
  • <blockquote> Only for passages of text quoted from another source
  • <q> Use for a short inline quote
  • <table> Never use for layout purposes
  • <cite> References to books, websites, articles
  • <address> Contact information for the author of a page
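
The practices above can be sketched in one small page. This is only an illustration: the Strict doctype is one possible choice, and the texts, URL and e-mail address are placeholders.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Example page</title>
</head>
<body>
<!-- exactly one h1, for the main topic -->
<h1>Main topic of the page</h1>
<!-- q for a short inline quote -->
<p>As the author puts it: <q>a short inline quote</q>.</p>
<!-- blockquote for a longer passage from another source -->
<blockquote cite="http://www.example.com/article">
	<p>A longer passage taken from another source.</p>
</blockquote>
<!-- cite for the title of the referenced work -->
<p>Source: <cite>Title of the quoted article</cite></p>
<!-- address for the author's contact information -->
<address>Contact: <a href="mailto:author@example.com">author@example.com</a></address>
</body>
</html>
```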

Other useful tips

Disallow compatibility mode

Tell Internet Explorer never to go into compatibility mode (rendering with an older IE engine than the one installed). Put the X-UA-Compatible meta tag after the <title> tag.
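
A minimal sketch of the tag in question, assuming the commonly used IE=edge value (an assumption; it asks Internet Explorer to use the newest engine it has installed):

```html
<head>
<title>Page title</title>
<!-- Assumed value: IE=edge selects the newest rendering engine available -->
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
</head>
```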


Linking style sheets

Use <link> for simplicity and better performance
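
As a sketch, a stylesheet linked from the head looks like this. The file name style.css is a placeholder, and the alternative being avoided is assumed to be an @import rule inside a style block, which browsers may fetch later than a linked sheet.

```html
<head>
<title>Page title</title>
<!-- Preferred: the browser can start fetching the stylesheet right away -->
<link rel="stylesheet" type="text/css" href="style.css" />
<!-- Assumed alternative being avoided:
<style type="text/css">@import url("style.css");</style>
-->
</head>
```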


Validate your code

http://validator.w3.org

Forget about presentational HTML attributes like

  • vspace
  • hspace
  • bgcolor
  • text
  • link
  • alink
  • vlink
  • img align: use CSS-floats instead
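
A sketch of CSS equivalents for the attributes above. The colors, spacing values, file name and the class name photo are placeholders chosen for illustration.

```html
<style type="text/css">
/* replaces bgcolor and text on <body> */
body { background-color: #ffffff; color: #333333; }
/* replaces link, vlink and alink */
a:link { color: #0000cc; }
a:visited { color: #551a8b; }
a:active { color: #cc0000; }
/* replaces align="left", hspace and vspace on <img> */
img.photo { float: left; margin: 5px 10px; }
</style>
<img class="photo" src="photo.jpg" alt="Example photo" />
```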

Also forget about browser-specific, low-level formatting attributes like

  • leftmargin
  • topmargin
  • marginwidth
  • marginheight
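
These can usually be replaced by a single CSS rule. A minimal sketch, assuming the goal was to remove the default page margins:

```html
<style type="text/css">
/* replaces leftmargin, topmargin, marginwidth and marginheight on <body> */
body { margin: 0; padding: 0; }
</style>
```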