As others mentioned, keeping the camera square to the subject will eliminate converging verticals (also called keystoning). We see converging verticals when lines that are | | in reality render as / \ . This has nothing to do with lens quality. All you can do is adjust the camera's tilt until the converging vertical edges render as parallel lines. In many cases (tall buildings) this is impractical: you can't get the camera high enough and still frame the subjects of interest.
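To put rough numbers on the effect, here is a minimal sketch that assumes an ideal pinhole (rectilinear) lens. The function name, the 18 mm focal length, and the 11.8 mm offset (about half the width of an APS-C sensor) are my own illustrative assumptions, not anything taken from the camera. It estimates how far a true vertical leans in the image when the camera is pitched up or down.

```python
import math

def vertical_lean_deg(pitch_deg, focal_mm, x_offset_mm):
    """Lean (degrees from vertical) of a true vertical line in the image.

    Assumes an ideal pinhole/rectilinear projection (hypothetical helper).
    pitch_deg:   how far the camera is tilted up or down from level
    focal_mm:    focal length
    x_offset_mm: horizontal distance of the line from the image center,
                 measured on the sensor where the line crosses the center row
    """
    pitch = math.radians(pitch_deg)
    # The verticals meet at a vanishing point f / tan(pitch) from the image
    # center. At zero pitch that point is at infinity, so the lean is zero.
    return math.degrees(math.atan(x_offset_mm * math.tan(pitch) / focal_mm))

# Assumed values: 18 mm lens, line at the edge of an APS-C frame (11.8 mm).
for pitch in (0, 5, 10, 20):
    print(pitch, round(vertical_lean_deg(pitch, 18.0, 11.8), 1))
```

The lean depends only on the pitch angle, the focal length, and where the line sits in the frame, which is why a better lens doesn't help. Lines on opposite sides of the center lean in opposite directions, giving the / \ look.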
The X-Pro 1 has an electronic level that helps eliminate horizontal tilt. You can check for vertical tilt by inspection using the optional grid overlay in the EVF display. I'm not sure the converging verticals are easy to detect in the OVF.
Most post-production software will correct for converging verticals. However, this results in a crop. The larger the correction, the more the frame is cropped. In my experience, correcting converging verticals is very easy when there is no horizontal tilt. Automated correction works well in Adobe products. But when both planes are off, proper corrections can be tedious.
Kudos to Shawn for mentioning volume deformation. This always occurs because it is impossible to project three-dimensional objects onto a two-dimensional plane without error. So, again, lens quality is not a factor. Volume deformation is often observed when spherical objects render as ellipsoids or square objects render as rectangles. It is most obvious when you are close to objects and/or they are at the frame edges.
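As a rough illustration of why the frame edges are worst, here is a small sketch that again assumes an ideal rectilinear (pinhole) projection and a small sphere. The function name and the sample numbers (18 mm lens, 14.1 mm offset for roughly the corner of an APS-C frame) are assumptions for the example. Under that model a small sphere is stretched along the line pointing away from the image center by about 1 / cos(angle off axis).

```python
import math

def edge_stretch(offset_mm, focal_mm):
    """Approximate radial stretch of a small sphere in a rectilinear image.

    Hypothetical helper, ideal pinhole model only.
    offset_mm: distance of the sphere's image from the frame center (on sensor)
    focal_mm:  focal length
    Returns the ratio of the ellipse's long axis to its short axis.
    """
    theta = math.atan2(offset_mm, focal_mm)  # angle off the optical axis
    return 1.0 / math.cos(theta)

# Assumed values: center, mid-frame, and roughly an APS-C corner.
for offset in (0.0, 7.0, 14.1):
    print(offset, round(edge_stretch(offset, 18.0), 2))
```

A longer focal length at the same sensor offset gives a smaller off-axis angle, so the stretch shrinks; that matches the usual experience that wide lenses show the effect most.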
Newer versions of Photoshop will minimize volume deformation (link