Most web sites today have some sort of registration module where a user is asked to choose a username/password combination. This data gets stored in a database. You might wonder whether the password you provide will be kept well protected (read: encrypted). If you are the person designing such a back-end registration component, why not give your users peace of mind by encrypting their passwords?
This scenario is a perfect candidate for "one-way hash encryption", also known as a message digest, one-way encryption, digital fingerprint, or cryptographic hash. It is referred to as "one-way" because although you can calculate a message digest from some data, you can't feasibly work out what data produced a given message digest. A good digest algorithm is also collision-resistant: two different values can in principle produce the same digest, but finding such a pair should be computationally infeasible. Another property of a digest is that it is a condensed representation of a message or a data file, and as such it has a fixed length.
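To make the fixed-length and repeatability properties concrete, here is a minimal sketch using the standard `java.security.MessageDigest` API. The class name `DigestProperties` and the helper `sha1` are made up for this illustration and are not part of the registration code discussed later.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestProperties {
    // Hypothetical helper: returns the raw SHA-1 digest of the UTF-8 bytes of the input.
    public static byte[] sha1(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            return md.digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] shortInput = sha1("a");
        byte[] longInput = sha1("a much longer message than the first one");

        // Both digests have the same length regardless of the input size.
        System.out.println(shortInput.length + " bytes, " + longInput.length + " bytes");

        // Hashing the same data again yields the same digest.
        System.out.println(MessageDigest.isEqual(shortInput, sha1("a")));
    }
}
```

A SHA-1 digest is always 20 bytes (160 bits), no matter whether the input is one character or a whole file.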
There are several message-digest algorithms used widely today.
| Algorithm | Digest length |
|---|---|
| MD5 | 128 bits |
| SHA-1 | 160 bits |
SHA-1 (Secure Hash Algorithm 1) is slower than MD5, but its message digest is larger, which makes it more resistant to brute-force attacks. It is therefore recommended that you prefer SHA-1 to MD5 for all of your digest needs. Note that SHA-1 now has even stronger siblings: SHA-256, SHA-384, and SHA-512, which produce 256-, 384-, and 512-bit digests respectively.
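If you want to check the digest lengths on your own JDK, a small sketch like the following can do it. The class name `DigestSizes` and the method `digestBits` are names I made up for the example; the algorithm names are the standard JCA names, and I am assuming your provider ships all five of them.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestSizes {
    // Hypothetical helper: digest length in bits for a JCA algorithm name.
    public static int digestBits(String algorithm) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest("mypass".getBytes(StandardCharsets.UTF_8));
            return digest.length * 8;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " is not available", e);
        }
    }

    public static void main(String[] args) {
        String[] algorithms = {"MD5", "SHA-1", "SHA-256", "SHA-384", "SHA-512"};
        for (String algorithm : algorithms) {
            System.out.println(algorithm + ": " + digestBits(algorithm) + " bits");
        }
    }
}
```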
Here is a typical flow of how our message digest algorithm can be used to provide one-way password hashing:
1) User registers with some site by submitting the following data:
| username | password |
|---|---|
| jsmith | mypass |
2) Before storing the data, a one-way hash of the password is created: "mypass" is transformed into "5yfRRkrhJDbomacm2lsvEdg4GyY="
The data stored in the database ends up looking like this:
| username | password |
|---|---|
| jsmith | 5yfRRkrhJDbomacm2lsvEdg4GyY= |
3) When jsmith comes back to this site later and decides to log in using his credentials (jsmith/mypass), a hash of the submitted password is created in memory (session) and compared to the one stored in the database. Both values are equal to "5yfRRkrhJDbomacm2lsvEdg4GyY=" since the same password value "mypass" was used both times. Therefore, his login will be successful.
Note that any other plaintext password value will produce a different sequence of characters. Even a similar password value ("mypast") that differs by only one letter results in an entirely different hash: "hXdvNSKB5Ifd6fauhUAQZ4jA7o8="
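The login comparison in step 3 could be sketched roughly as follows. `LoginCheck`, `hash` and `matches` are hypothetical names for this illustration, and I am assuming the stored value is the Base64 form of a SHA-1 digest, as in the table above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class LoginCheck {
    // Hypothetical helper: Base64-encoded SHA-1 digest of the password.
    public static String hash(String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] raw = md.digest(password.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(raw);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // Hashes the submitted password and compares it with the stored hash.
    public static boolean matches(String submittedPassword, String storedHash) {
        byte[] submitted = hash(submittedPassword).getBytes(StandardCharsets.US_ASCII);
        byte[] stored = storedHash.getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(submitted, stored);
    }

    public static void main(String[] args) {
        String storedHash = hash("mypass"); // what registration would have saved
        System.out.println(matches("mypass", storedHash));
        System.out.println(matches("mypast", storedHash));
    }
}
```

The plaintext password is never stored or compared directly; only the two hashes are.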
| plaintext password | encrypted password |
|---|---|
| mypass | 5yfRRkrhJDbomacm2lsvEdg4GyY= |
| mypast | hXdvNSKB5Ifd6fauhUAQZ4jA7o8= |
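To see for yourself how different the two digests are, a sketch along these lines counts the bits that differ between them. The class `SimilarPasswords` and its helpers are names invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class SimilarPasswords {
    // Hypothetical helper: raw SHA-1 digest of the UTF-8 bytes of the input.
    public static byte[] sha1(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            return md.digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // Counts the bit positions in which two equal-length digests differ.
    public static int differingBits(byte[] a, byte[] b) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            count += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] first = sha1("mypass");
        byte[] second = sha1("mypast");
        System.out.println(Base64.getEncoder().encodeToString(first));
        System.out.println(Base64.getEncoder().encodeToString(second));
        System.out.println(differingBits(first, second) + " of 160 bits differ");
    }
}
```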
As mentioned above, provided that a strong algorithm such as SHA is used, it is computationally infeasible to reverse-engineer the hashed value "5yfRRkrhJDbomacm2lsvEdg4GyY=" back to "mypass". Therefore, even if a malicious hacker gets hold of your password digest, he/she won't be able to determine what your password is.
Let's assume that you are writing a web application to be run in a servlet container. Your registration servlet might have the following portion (for clarity, I omitted input validation steps and assume that a password value was passed in within the password form input field):
[...]
public void doPost(HttpServletRequest request, HttpServletResponse response)
{
    User user = new org.myorg.registration.User();
    user.setPassword(org.myorg.services.PasswordService.getInstance().encrypt(request.getParameter("password")));
[...]
Here is the definition of my PasswordService class that does the job of generating a one-way hash value:
package org.myorg.services;

import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

import org.myorg.SystemUnavailableException;

public final class PasswordService
{
    private static PasswordService instance;

    private PasswordService()
    {
    }

    public synchronized String encrypt(String plaintext) throws SystemUnavailableException
    {
        MessageDigest md = null;
        try
        {
            md = MessageDigest.getInstance("SHA"); //step 2
        }
        catch(NoSuchAlgorithmException e)
        {
            throw new SystemUnavailableException(e.getMessage());
        }
        try
        {
            md.update(plaintext.getBytes("UTF-8")); //step 3
        }
        catch(UnsupportedEncodingException e)
        {
            throw new SystemUnavailableException(e.getMessage());
        }
        byte raw[] = md.digest(); //step 4
        // java.util.Base64 (Java 8+) is used here in place of the JDK-internal sun.misc.BASE64Encoder
        String hash = Base64.getEncoder().encodeToString(raw); //step 5
        return hash; //step 6
    }

    public static synchronized PasswordService getInstance() //step 1
    {
        if(instance == null)
        {
            instance = new PasswordService();
        }
        return instance;
    }
}
The method of interest here is encrypt(). I chose to make this class a singleton in order to ensure that there is only one instance of it at any given time, to avoid concurrency issues and conflicts between generated hash values. For an explanation of this design pattern, try a Google search for "java singleton pattern".
Let's step through the code above to see what's going on:
step 1: The registration servlet will interface with our PasswordService class using this static getInstance() method. Whenever it is invoked, a check will be made to see if an instance of this service class already exists. If so, it will be returned back to the caller (registration servlet). Otherwise, a new instance will be created.
step 2: We ask the Java security API for a message digest object that uses the supplied algorithm. In this case, the SHA-1 message digest algorithm will be used; both "SHA" and "SHA-1" refer to the same thing, a revised SHA algorithm. The Sun JDK includes the JCA (Java Cryptography Architecture), which supports the SHA algorithm. If your environment does not support SHA, a NoSuchAlgorithmException will be thrown.
step 3: Feed the data:
a) convert the plaintext password (e.g., "mypass") into a byte representation using the UTF-8 encoding format.
b) apply this array to the message digest object created earlier. This array will be used as a source for the message digest object to operate on.
step 4: Do the transformation: generate an array of bytes that represents the digested (encrypted) password value.
step 5: Create a String representation of the byte array holding the digested password value. This is needed to be able to store the password in the database. At this point, the hash value of the plaintext "mypass" is "5yfRRkrhJDbomacm2lsvEdg4GyY=".
step 6: Return the String representation of the newly generated hash back to our registration servlet so that it can be stored in the database. The user.getPassword() method now returns "5yfRRkrhJDbomacm2lsvEdg4GyY="
That's all. Your database password data is now encrypted, and if an intruder gets hold of it, he/she won't have much use for it. Note that you have to consider how you will handle "forgot password" functionality in this case, as you now cannot simply send a password to the user's email address. (Well, you should not be doing things like that anyway.) Sounds to me like a perfect topic for my next article.
Comments
Lost passwords
You have other options of course, for instance generating a new password when one is forgotten, but you'll then have to be careful that no DoS is possible using this mechanism, which further complicates the setup. Or you can ask the user to re-validate his email when setting the new password. That's all added complexity that many sites won't want to go through.
re: Lost passwords
- opaque Id: i like to store two id's in user database: one is their username and the other is a random opaqueId that gets generated during registration and is stored in the user table.
- email verification: verify user's email address during registration (send him/her a confirmation email and ask to visit a link). the opaqueId gets sent on the query string within the verification email.
- forgot password?: when user forgets a password, he enters his username and gets an email only if the email address was previously verified (see email verification above). if so, the email sent to the user contains a link that includes an opaqueId on the query string. when user visits the url, he/she is auto-logged in and is presented with a "choose new password" form. note, you can visit this url only once. it expires immediately after you do that, or it can expire if it is not visited within a specific period (such as 1 hour).
So, I agree. These steps are way more than most sites do to secure user passwords. But would anyone like their password to be sent to their email address if they use a non-ssl mail client? What if you happen to use this password with other important (such as banking) sites?
correction to my prev. comment
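A rough sketch of the single-use, expiring link described above might look like this. The class `ResetLinks` and its methods `issue` and `redeem` are hypothetical, the store is an in-memory map purely for illustration, and the sketch issues a fresh random ID for each reset request rather than reusing the one saved at registration, which is an assumption on my part.

```java
import java.security.SecureRandom;
import java.time.Duration;
import java.time.Instant;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ResetLinks {
    // Who the link belongs to and when it stops working.
    private static final class Entry {
        final String username;
        final Instant expiresAt;

        Entry(String username, Instant expiresAt) {
            this.username = username;
            this.expiresAt = expiresAt;
        }
    }

    private final SecureRandom random = new SecureRandom();
    private final Map<String, Entry> pending = new ConcurrentHashMap<>();
    private final Duration lifetime;

    public ResetLinks(Duration lifetime) {
        this.lifetime = lifetime;
    }

    // Creates a random, URL-safe opaque ID to put on the query string of the emailed link.
    public String issue(String username, Instant now) {
        byte[] bytes = new byte[32];
        random.nextBytes(bytes);
        String opaqueId = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        pending.put(opaqueId, new Entry(username, now.plus(lifetime)));
        return opaqueId;
    }

    // Returns the username if the ID is known and not expired, otherwise null.
    // The entry is removed on the first visit, so the link works only once.
    public String redeem(String opaqueId, Instant now) {
        Entry entry = pending.remove(opaqueId);
        if (entry == null || now.isAfter(entry.expiresAt)) {
            return null;
        }
        return entry.username;
    }
}
```

With `new ResetLinks(Duration.ofHours(1))`, a link is good for one visit within an hour of being issued; a second visit or a late visit gets nothing back.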
re: Lost Passwords
a link vs. your password in your inbox
- If you reuse the same password at several sites, and somebody gets hold of your password for one of the sites, you are in trouble. A link that is only applicable to one particular site can only be compromised on that one site.
- Your password does not expire. Chances are when you get a plaintext password in your email, you keep using it and don't always delete the original message. This could be dangerous. The solution I described is designed so that the password help link expires as soon as it is visited or as soon as a specific time period that you set expires, whichever comes first.
These are just two points I could think of at the time. I am sure there are more.
re: insecure inboxes
>> email and visit the URL before they do?
possible, I suppose. The system I use creates an md5 hash, and compares an md5-hashed version of the submitted password with the stored hash, as referenced in the article.
the way I deal with lost passwords is by sending in the email a hashed email address and hashed password, then check these against what's in the db and then have the user input a new password, etc.
there's no way anyone scamming an inbox would know what the account NAME was, so they'd be unable to use it...
does that make sense?
re: insecure inboxes
>> the way I deal with lost passwords is by sending in the email a hashed email address and hashed password then check these against whats in the db and then have the user input a new password, etc.
This makes sense to me. What do you mean when you say what the account NAME was ?
I guess I am not a big fan of sending someone's password over the internet (even when it's hashed, especially given the fact that md5 is not the most secure hash algorithm). But I could be too paranoid about it.
re: insecure inboxes
right - no finding users by their email, and no multiple users on one email account... i actually find that more secure ... and easier to track users...
do you know if PHP supports SHA - I mostly use PHP for sites, and md5 is a native function, but SHA does sound more secure... nice job on the article BTW
re: insecure inboxes
sha1
re: sha1
re: sha1
yeah - thx