'Limit results of a joined table to the number of entries in the base table

I use Lansweeper to keep track of all server assets. At the moment I am working on a report, which shows which webservers are installed and running. For this I have to join to tables, Table A contains the installed webserver software, table B contains the names of the services

The expected result should like this:

VM Version Class Servicename
VM1 9 Tomcat tomcat9
VM2 9 Tomcat tomcat9
VM2 7 Tomcat tomcat7
VM3 9 Tomcat tomcat9

It is working fine, except for servers where multiple webservers are installed. If I have two webservers on a machine I get 4 results in the table. This is because I join based on the VM-ID, so the result looks like this:

VM Version Class Servicename
VM1 9 Tomcat tomcat9
VM2 9 Tomcat tomcat9
VM2 7 Tomcat tomcat7
VM2 9 Tomcat tomcat7
VM2 7 Tomcat tomcat9
VM3 9 Tomcat tomcat9

This is the SQL statement for this result:

select a.VM, a.Version, a.Class, b.Servicename
from Installedsoftware as a
inner join Services as b on a.VM = b.VM

I totally understand that this happens because of two entries in table A und table B which results in 4 entries because of the join.

Is there a way to avoid this without adding a second join condition? Because this would only help in this scenario but not if I have two tomcat9 servers on the same machine. Is there a way to limit the results in a way like "if Table A has only two entries for VM1, then show only two lines of the joined table"?



Solution 1:[1]

This seems like a groupwise-maximum problem, which has several potential approaches, but all encourage a predictable, deterministic row as opposed to "I don't care which row." One way is:

select a.VM, a.Version, a.Class, b.Servicename
from dbo.Installedsoftware as a
left outer join 
(
  SELECT VM, Servicename, rn = ROW_NUMBER() OVER
    (PARTITION BY VM ORDER BY Servicename)
  FROM dbo.Services
) as b on a.VM = b.VM
AND b.rn = 1;

You could get a little closer to "I don't care" by changing:

    (PARTITION BY VM ORDER BY @@SPID)

Or to a pseudo-random row (at a performance cost) using:

    (PARTITION BY VM ORDER BY NEWID())

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Aaron Bertrand