Secure Sockets Layer (SSL) is the standard method of securing Web transactions. The mathematical computations that SSL cryptography requires involve very large numbers and math functions that are not in the instruction set of most commercial processors.
These computations are typically done in software, which places a tremendous burden on servers: a performance drop of two orders of magnitude is commonly observed. A server capable of processing 1,000 transactions per second can process only 10 transactions per second when all of them are SSL-protected.
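The arithmetic behind that example is easy to check. The sketch below assumes a uniform slowdown of two orders of magnitude, as in the figure above; the function name and parameters are illustrative only.

```python
def ssl_throughput(plain_tps: float, slowdown_orders: int = 2) -> float:
    """Estimate transactions per second when every transaction is SSL-protected.

    Assumes a flat slowdown of 10**slowdown_orders (an assumption taken from
    the "two orders of magnitude" figure, not a measured value).
    """
    return plain_tps / (10 ** slowdown_orders)


print(ssl_throughput(1000))  # 10.0
```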
The traditional solution to the performance problem is to buy more multi-CPU servers to handle the secure traffic, and to restrict secure Web pages to the minimum set of critical transactions, such as the exchange of credit card numbers.
An alternative approach is to accelerate the SSL cryptography with coprocessors. These products perform RSA encryption, bulk encryption, or both; all still depend on a host processor (or the network processor) to pass SSL records to and from the cryptography chip.
These coprocessors can process thousands of new SSL handshake requests per second. However, they require substantial "glueware" to support the cooperative processing between the cryptography hardware and the host processor, and most still rely on the PCI (or PCI-X) bus to convey data between the cryptography chips and the host. This architecture adds complexity and introduces performance bottlenecks, because processing even a single SSL session handshake takes multiple exchanges between the cryptography processor and the host CPU.
System on a chip
One solution is to place an entire system on a chip to perform the traffic classification, the entire SSL protocol and all bulk encryption.
This removes any interaction with the host CPU, reducing complexity and significantly improving performance. A security processor on a chip presents an industry-standard Gigabit Ethernet interface to the client side and another one to the server side.
One of the most intimidating hurdles to terminating large numbers of SSL sessions is processing the TCP/IP packets that encapsulate the SSL records. A gigabit of TCP/IP traffic alone will bury a traditional CPU, without ever setting up an SSL session.
The new chips integrate a high-performance TCP/IP processor that, for SSL traffic, handles TCP segmentation, packet reordering and other protocol functions that can bog down the host. The client-side interface is a Gigabit Media Independent Interface (GMII) port, which sits directly behind the physical interface of a network interface card (NIC) or appliance.
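Packet reordering, one of the TCP functions mentioned above, can be illustrated in a few lines. This is a minimal software sketch, not a description of how any particular chip implements it: the function name and the (sequence number, payload) representation are assumptions, and it ignores overlapping segments and sequence-number wraparound.

```python
def reassemble(segments, initial_seq):
    """Rebuild an in-order byte stream from (seq, payload) TCP-like segments.

    Segments may arrive out of order or duplicated. Delivery stops at the
    first gap, mirroring how a receiver can only hand contiguous data upward.
    Returns the contiguous bytes and the next expected sequence number.
    """
    pending = {}
    for seq, payload in segments:
        pending.setdefault(seq, payload)  # keep the first copy, drop duplicates

    stream = bytearray()
    next_seq = initial_seq
    while next_seq in pending:
        payload = pending.pop(next_seq)
        stream += payload
        next_seq += len(payload)
    return bytes(stream), next_seq


# Segments arrive out of order, with one duplicate.
arrived = [(105, b" worl"), (100, b"hello"), (110, b"d"), (100, b"hello")]
print(reassemble(arrived, 100))  # (b'hello world', 111)
```

Doing this work per connection at gigabit rates is the kind of load that the integrated TCP/IP processor is meant to take off the host.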
Client HTTP traffic is passed directly through the chip to the server port, also a GMII interface. Incoming SSL traffic is routed to the cryptography section of the chip, which performs all the SSL protocol functions and bulk encryption, and grooms the resulting clear text messages before presenting them to the server port.
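The traffic split described above can be sketched as a simple classifier. The article does not say how the chip distinguishes SSL from plain HTTP; the example assumes classification by destination port, with 443 as the conventional HTTPS port. The `Packet` type and `classify` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Assumption: SSL traffic is identified by the conventional HTTPS port.
SSL_PORTS = frozenset({443})


@dataclass(frozen=True)
class Packet:
    dst_port: int
    payload: bytes


def classify(packet: Packet) -> str:
    """Return 'crypto' for SSL traffic and 'passthrough' for everything else."""
    return "crypto" if packet.dst_port in SSL_PORTS else "passthrough"


print(classify(Packet(dst_port=80, payload=b"GET / HTTP/1.1")))  # passthrough
print(classify(Packet(dst_port=443, payload=b"\x16\x03\x01")))   # crypto
```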
Grooming is key
The target server will see no performance degradation when handling SSL traffic compared with non-SSL traffic and could, in fact, perform relatively better with SSL traffic, because the TCP/IP packets presented to it have been groomed.
An in-line SSL solution is not valuable unless it performs all its network and cryptography functions at wire speed, up to 1 gigabit per second throughput, full duplex. This translates to the ability to handle up to 100,000 new SSL handshake requests per second.
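The article does not state how the handshake figure was derived. As a rough sanity check only, the sketch below shows the share of line capacity per handshake that the two quoted numbers imply, in one direction; the function name is illustrative.

```python
def bytes_per_handshake(line_rate_bps: int, handshakes_per_sec: int) -> float:
    """Bytes of one-direction line capacity available to each new handshake."""
    bytes_per_sec = line_rate_bps / 8
    return bytes_per_sec / handshakes_per_sec


# 1 gigabit per second shared across 100,000 new handshakes per second.
print(bytes_per_handshake(1_000_000_000, 100_000))  # 1250.0
```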
Configuration, loading of SSL key and certificate files, and retrieval of management information are performed via a management port, which is a GMII interface. All management functions and communications can be secured with SSL.
The need to secure more network traffic and to achieve wire-speed performance demands a new approach to cryptography technologies. The in-line approach creates a highly manageable SSL solution that delivers that performance and is easy to integrate with Web server NICs, SSL appliances and other Layer 4-7 devices.