This page is just a loosely
organized collection of notes on using tools
like netcat in some common tasks.
Overview
netcat is a tool that adapts TCP and UDP network connections
for use in common Unix pipelines, much like the
ubiquitous
cat utility.
It can be used to connect to a remote server and copy whatever
data arrives from the network to its standard output.
It can also be used to create short-lived
network services that others can connect to.
netcat is a popular tool, but it isn't the only one.
Here's a list; pick and choose.
The rest of this page is written for
netcat but you
should be able to adapt the examples to whatever you
choose here. I might supply
ucspi-tcp examples on
this page if I'm asked to (it, and
netcat, are the
tools that I use daily... for the other tools, you're
on your own).
One way of thinking about these kinds of tools is that they
allow you to use network connections in your shell scripts.
Another way of thinking about them is that they let you
insert a network connection into your pipelines that connect
commands together.
Try to keep both in mind...
Moving Files
First Approach
Okay, first up: moving hierarchies of files around.
This is a common task, and one that we've done many many
times before.
rcp,
scp,
rsync... the list of tools that can do
this goes on and on.
You've got a hierarchy of files on machine
beta and
you want to move them to machine
alpha.
No problem, you just log onto
alpha and run
something like
scp -r beta:/from/here /to/here and
you're finished. Great.
One problem, though, is that these tools often require that you
have access to machines on both sides of the network.
For example, in order for that previous
scp to work,
you have to have some kind of access (by key or password)
to an account on
beta.
What if you don't have an account on
beta?
Someone has to give you one, or they have to give you access
to an existing account (theirs?) on
beta.
Site security policies can make this nontrivial, as you
can imagine.
Let's say, instead, that someone with an account on
beta
wants to give you a hierarchy of files.
Instead of archiving them up (taking up disk space) and
dropping them on a web server or an FTP account,
here's an alternative approach.
Let's make an archive of all those files on the fly, not
making copies anywhere, and instead just send the archive
right over the network to your system.
beta$ cd /from/here
beta$ tar cf - . |nc -l -p 54321
Let's consider what that's doing.
First, we get into the tree of files that we want to copy.
Then, we use
tar to create an archive of them, and we
send that stream of data to
nc.
The netcat is told to listen for an incoming connection
on port 54321.
That's it.
When
tar has no more data to send, it'll close its
pipe to netcat, who in turn will close its network connection
to the client, all exiting cleanly.
No intermediate storage was required,
no web server or FTP site was involved, and when you're
done, there's nothing to clean up.
Over on your side of the network, run something like this.
alpha$ cd /to/here
alpha$ nc beta 54321 |tar xf -
Here, you're entering the destination for this hierarchy
of files.
Then, netcat is used to make a connection to
beta
at port 54321.
All data seen on this connection is piped right into
tar.
When the connection closes, netcat will close its pipe to
tar, who will exit cleanly.
And now there's a whole new hierarchy of files in the local
filesystem.
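If you'd like some reassurance that everything arrived intact, one option is to print a checksum manifest on each side and compare the two. This is only a sketch: tree_manifest is a made-up helper name, and the two temporary directories stand in for /from/here and /to/here so it can be tried on a single machine with no network.

```shell
#!/bin/sh
# Sketch: print a sorted "checksum size path" line for every file under a
# directory. Run it on both machines and compare the output by eye or with
# diff. tree_manifest is a made-up helper name, not part of netcat or tar.
tree_manifest() {
    ( cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort )
}

# Offline demo: two temp dirs stand in for /from/here and /to/here.
src=$(mktemp -d)
dst=$(mktemp -d)
echo "alpha" > "$src/one.txt"
mkdir "$src/sub" && echo "beta" > "$src/sub/two.txt"
( cd "$src" && tar cf - . ) | ( cd "$dst" && tar xf - )

# Identical manifests mean identical file names and contents.
if [ "$(tree_manifest "$src")" = "$(tree_manifest "$dst")" ]; then
    manifest_check="match"
else
    manifest_check="mismatch"
fi
echo "manifests $manifest_check"
```

In real use you'd run tree_manifest /from/here on beta and tree_manifest /to/here on alpha, then compare the two listings.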
Another Way Of Looking At It
Consider this. If we had no network to deal with, one way
of moving files could be a command like the following.
foo$ ( cd /from/here; tar cf - . ) | ( cd /to/here; tar xf - )
A silly thing to do would be to insert a
cat utility into the
mix.
foo$ ( cd /from/here; tar cf - . ) | cat | ( cd /to/here; tar xf - )
But a not so silly thing to do would be to imagine a network
connection instead of
cat.
foo$ ( cd /from/here; tar cf - . ) | *NETWORK* | ( cd /to/here; tar xf - )
Splitting it up along the NETWORK line would make it look like this.
beta$ ( cd /from/here; tar cf - . ) | *NETWORK*
alpha$ *NETWORK* | ( cd /to/here; tar xf - )
And now, just insert the endpoints of a network connection.
beta$ ( cd /from/here; tar cf - . ) | nc -l -p 54321
alpha$ nc beta 54321 | ( cd /to/here; tar xf - )
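Before putting a network in the middle, it can help to rehearse the pipeline on one machine. The sketch below does the "silly" version with cat standing in for the network, using throwaway temporary directories and made-up file names in place of /from/here and /to/here.

```shell
#!/bin/sh
# Local rehearsal of the tar pipeline; no network involved.
# The directories are throwaway temp dirs created just for this demo.
src=$(mktemp -d)
dst=$(mktemp -d)

# Build a tiny hierarchy to copy.
mkdir -p "$src/docs/sub"
echo "hello" > "$src/docs/a.txt"
echo "world" > "$src/docs/sub/b.txt"

# Same shape as the examples above; cat stands in for the network.
( cd "$src" && tar cf - . ) | cat | ( cd "$dst" && tar xf - )

# diff -r prints nothing and exits 0 when the two trees match.
if diff -r "$src" "$dst"; then
    copy_status="identical"
else
    copy_status="different"
fi
echo "trees are $copy_status"
```

Once this works, swapping cat for the two netcat endpoints is the only change needed.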
Going In Reverse
So, in the previous example, we
connected to a port that
sent us
a hierarchy of files as a data stream.
Do we always have to go that way? Of course not!
Imagine, for example, that
beta was behind a firewall.
Your friend can make outbound connections just fine, but
you can't connect to him easily.
How to get around this? Reverse the connection. Let
beta
be the connection maker, and make
alpha the listener.
alpha$ cd /to/here
alpha$ nc -l -p 54321 |tar xf -
beta$ cd /from/here
beta$ tar cf - . |nc alpha 54321
That's it. All that's different here is which end of the network
connection was the "server"; that is to say, which end was listening
for a new connection.
The data still moved in the same direction, and the same data was
moved.
However it would be irresponsible of me not to mention that
this secondary approach is slightly less secure. In the first
example, anyone on the network (not just you!) could connect to
beta (if they knew we were using port 54321) and get the copy
of the files meant for you.
In this second example, however, anyone could connect to you and
send you
malicious data which you'd then copy right into your
filesystem via
tar.
Certainly, the odds of someone finding this port open for the few seconds
you're running it are tiny. But, there is a chance it could
happen, and so you should be warned: run services like this for
only a few seconds, and only "by hand". Automating services like
this is better left to more secure methods like
scp that can
involve key management and other overhead.
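If you do end up as the listener, one cautious habit is to save the incoming stream to a file and look at it before unpacking anything. The sketch below is an assumption-laden illustration, not a complete defense: incoming.tar and archive_looks_safe are made-up names, the check only rejects absolute paths and ".." components, and a locally built archive stands in for what netcat would have received.

```shell
#!/bin/sh
# Sketch: vet an archive before unpacking it. In real use, incoming.tar
# (a hypothetical name) would be filled by: nc -l -p 54321 > incoming.tar
# Here a locally built archive stands in so the check can be tried offline.
work=$(mktemp -d)
mkdir -p "$work/src/dir" "$work/dest"
echo "data" > "$work/src/dir/file.txt"
( cd "$work/src" && tar cf "$work/incoming.tar" . )

# Return 0 only if the archive can be listed and no member name is
# absolute or contains a ".." path component.
archive_looks_safe() {
    names=$(tar tf "$1") || return 1
    if printf '%s\n' "$names" | grep -E -q '^/|(^|/)\.\.(/|$)'; then
        return 1
    fi
}

if archive_looks_safe "$work/incoming.tar"; then
    ( cd "$work/dest" && tar xf "$work/incoming.tar" )
    verdict="extracted"
else
    verdict="rejected"
fi
echo "$verdict"
```

This gives up the "no intermediate storage" benefit, which is the price of getting a look at the data first.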
Compression
So, you're moving your files around.
beta$ cd /from/here; tar cf - . |nc -l -p 12345
alpha$ cd /to/here; nc beta 12345 |tar xf -
You realize, though, that those files are mostly text, and would
compress really well.
Since the bottleneck for what we're doing is almost certainly
the network, compressing the data that goes over the wire would
be great.
What we'd like is to turn our existing pipeline into a pipeline
of compressed data.
This is trivial. Let's use the
gzip tool, which when
used in a pipeline, will process its standard input
and write the compressed data onto its standard output.
It comes with a tool
gunzip that works the same way,
decompressing its input and writing the original data on
output.
Take your original commands,
beta$ cd /from/here; tar cf - . |nc -l -p 12345
alpha$ cd /to/here; nc beta 12345 |tar xf -
And just insert
gzip and
gunzip right before and after the
network data stream.
beta$ cd /from/here; tar cf - . |gzip |nc -l -p 12345
alpha$ cd /to/here; nc beta 12345 |gunzip |tar xf -
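To get a feel for how much gzip might save before involving the network, you can measure the stream locally. This is a rough sketch with made-up sample text and temporary directories; real savings depend entirely on your data.

```shell
#!/bin/sh
# Sketch: measure what gzip saves on text-heavy data, and confirm that
# gzip | gunzip hands tar back exactly the bytes it was given.
src=$(mktemp -d)
dst=$(mktemp -d)

# Repetitive text, standing in for "mostly text" files (made-up content).
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > "$src/notes.txt"

# Bytes that would cross the wire without and with compression.
raw_bytes=$( ( cd "$src" && tar cf - . ) | wc -c | tr -d ' ' )
gz_bytes=$( ( cd "$src" && tar cf - . ) | gzip | wc -c | tr -d ' ' )
echo "raw: $raw_bytes bytes, gzipped: $gz_bytes bytes"

# Round trip with the network hop left out.
roundtrip="failed"
( cd "$src" && tar cf - . ) | gzip | gunzip | ( cd "$dst" && tar xf - )
cmp "$src/notes.txt" "$dst/notes.txt" && roundtrip="ok"
echo "round trip: $roundtrip"
```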
Image Processing
With all the focus on moving hierarchies of files around, it's
easy to lose track of the fact that you can move any data around
the network with these tools.
Let's say you've got two systems at your disposal.
mega is a big server somewhere with tons of data on it,
and
micro is your laptop.
You want to do some image processing, perhaps synthesizing
a new image from 128 GB of raw image data stored locally on
mega, and you want to display the results locally on
micro.
For starters, consider the image generation steps.
The following pipeline is entirely hypothetical, but might
be used to locate all the multispectral infrared images stored in an archive somewhere,
and pass their names to a geo-rectification tool, which generates
a sequence of geo-rectified images on its output.
This image sequence can then be scanned for forest fires based
on what's found in the various images.
mega$ find /wasp/archive/2005-07-11 -name '*IR*img' |georect -f - |firescan >/var/tmp/firemap.tiff
Let's say that, instead of copying the image over from
mega to
micro
every time, you'd just like to get the data immediately.
Okay, so add a network connection to the pipeline.
Here, we'll use the ImageMagick utility
display, instructing it to
read an image from standard input for local display.
micro$ nc -l -p 12345 |display -
mega$ find /wasp/archive/2005-07-11 -name '*IR*img' |georect -f - |firescan |nc micro 12345
I know what you're thinking.
"Oh, but now the image is lost once it's displayed!
I'll have to recreate it every time!"
No problem; just add a
tee command to the pipeline.
This will save a copy of the data to a local file on
micro
before passing it along to the next utility in the pipeline.
micro$ nc -l -p 12345 |tee firemap.tiff |display -
mega$ find /wasp/archive/2005-07-11 -name '*IR*img' |georect -f - |firescan |nc micro 12345
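If tee is new to you, here's a small offline sketch of what it does in that pipeline. printf stands in for the image producer and wc -c stands in for display, so it runs without ImageMagick or a network; saved.dat is a made-up file name.

```shell
#!/bin/sh
# Sketch: tee writes a copy of the stream to a file and also passes it on.
# printf plays the part of the image source, wc -c the part of display.
work=$(mktemp -d)

# Count the bytes that reach the stage after tee.
seen_bytes=$(printf 'pretend image bytes\n' | tee "$work/saved.dat" | wc -c | tr -d ' ')

# Count the bytes that tee saved to the file along the way.
saved_bytes=$(wc -c < "$work/saved.dat" | tr -d ' ')

echo "downstream saw $seen_bytes bytes; file holds $saved_bytes bytes"
```

Both counts come out the same, which is the point: the saved file and the displayed image are the same bytes.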
More
More examples here...