Connect ed2015 bp109_sametime voice and video in the real world
Transcript of Connect ed2015 bp109_sametime voice and video in the real world
Bigger On the Inside When It's Working Troubleshoot It When It's Fixed Make It Mobile Bigger On the Outside When It's Resilient, Break It When It's Secure Hack It Lessons Learned Outside the Box
Bigger on the Inside
More to unlock - which you have already paid for - than you may have previously thought
Simple per-user licensing
No additional software cost to add voice and video
No additional software cost to cluster for scale and reliability
Mobile device access is always included
Bigger on the Inside - Sametime AV Product Options
“Sametime Audio Video” : Connect Client to Client AV “calls”, click user (no number)
“Sametime Voice” / “ST ” / “SUT-Lite” : Client calls to/from Phones and external Video System/Clients – by numbers/SIP URIs (sip:...@...)
Android / iOS Mobile Clients provide connectivity for the Mobile User
Sametime Meetings offers a zero-download AV browser client
Sametime Video Manager/MCU will talk to any/all such clients for Conferencing
(Full-Fat) Sametime Unified Telephony : phone control (full telephony)
All of the above uses SIP, SDP and RTP – for more details see last year’s presentation
http://www.slideshare.net/kbmsg/jmp206
[Diagram: Communicate, Conference, Complete and SUT packages]
Crash Recap of Voice, Video, Conferencing and Telephony Terminology
SIP - Session Initiation Protocol: standard for making calls (sessions) between endpoints using INVITEs, endpoints which may move typically REGISTER first
SDP - Session Description Protocol: standard for describing audio/video/etc sessions
(S)RTP - (Secure) Real-time Transport Protocol: standard for sending/receiving audio/video/etc in packets
Codec - standard for packaging audio/video – G.711 is telephone quality voice, G.729 (patented/licensed) and iLBC (open source/free) are highly compressed audio
MCU - Multipoint Control Unit: audio/video mixer for conference calls
TLS - Transport Layer Security: encryption standard providing secure communications. Early versions of TLS were called SSL (Secure Sockets Layer).
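To make the SIP/SDP/codec terms above concrete, here is a minimal Python sketch that lists the codecs offered in an SDP body. The sample SDP, its addresses and ports are invented for illustration and are not taken from a Sametime trace; PCMU and PCMA are the two G.711 variants.

```python
def parse_sdp_codecs(sdp: str) -> dict:
    """Map each media type (audio, video) to the codec names offered for it."""
    codecs = {}
    current = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <payload types...>
            current = line[2:].split()[0]
            codecs[current] = []
        elif line.startswith("a=rtpmap:") and current is not None:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            _, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            codecs[current].append(encoding.split("/")[0])
    return codecs


# Hypothetical SDP offer (documentation addresses, illustrative ports).
EXAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 20830 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
m=video 20832 RTP/AVP 96
a=rtpmap:96 H264/90000
"""

if __name__ == "__main__":
    print(parse_sdp_codecs(EXAMPLE_SDP))
```

A long codec list like the one real clients send is exactly what makes the SIP message large, which matters later for UDP and SIP-aware firewalls.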
KickOff: Consider Your Raison D’Etre for Sametime Voice and Video
Do you want to cut costs by reducing phone handsets?
– Does Sametime Voice and Video therefore need to be as reliable as your PBX?
– Can ST fit into your dialplan?
Or cut costs by centralizing external calls?
– Watch out for internal billing issues as well as regulatory restrictions
– If you want external calls to keep routing out of each site, the configuration is very complex without SUT
Or cut costs by using internal conferencing?
Or simply improve collaboration?
KickOff: When you think you know what to do, Assume you don’t!
Hold at least one full day workshop with all parties - including decision makers - to
– Discuss functional as well as non-functional (scale, resilience, security) requirements
– Ensure everyone is aware of all the possibilities
Compile, document and plan to perform a comprehensive Test Plan
– Anything not tested is not guaranteed to work – therefore test thoroughly and do not just cover a few use cases
Plan a Pilot in an equivalent environment to production OR Plan a suitably sized Reference/Staging environment – clustered and secure if these will be used in production – reconfiguring and re-testing for complex issues in production is painful!
“When It’s Working, Troubleshoot It!”
Understanding the basic ways the calls flow
– Arm yourself for real troubleshooting
– Prepare your mind for the added complexities of clustering
ST Connect Client to Client Call Flow
[Diagram: the two Clients, CS, CM, SIPPR and BWM, linked by VP, SIP and RTP paths with numbered steps 1 to 9]
Client asks Conference Manager (CM) to set up a call via Virtual Places (VP) request to Community Server (CS) (1,2)
CM sends all SIP requests through SIP Proxy and Registrar (SIPPR) (3,6)
SIPPR may consult with Bandwidth Manager (BWM) – a B2BUA* which can modify SDP or deny call (4,7)
CM/SIPPR sends requests to Caller Client first (3,5) and then Called client (6,8)
Clients accept calls (200 OK) with media details in SIP SDPs - these flow through the above paths in ACKs and (re-)INVITEs, giving each client the details
Real-time Transport Protocol (RTP) audio/video flows directly from client to client (9)
* B2BUA = SIP Back to Back User Agent, this is two SIP User Agents (UAs) combined: a User Agent Server (UAS) which receives a call and a separate User Agent Client (UAC) which initiates a new call based heavily on the original call but modified as required
Video Manager not involved even for Video Calls
Two “calls” without BWM, Four with it
SIPPR and BWM “see everything” EXCEPT conditions at/between the Clients
Conference Leg Call Flow
[Diagram: Client, CS, CM, SIPPR, BWM, VMGR and VMCU, linked by VP, SIP and RTP paths with numbered steps 1 to 10]
Client asks CM to set up calls via VP request to CS (1,2)
CM sends SIP requests to clients through SIPPR (3,5)
SIPPR consults with BWM if configured (4)
SIPPR sends request to Client and it accepts call (200 OK) (5)
CM sends call request direct to Video Manager (VMGR) (6)
VMGR sends new call request (like a B2BUA) to Video MCU (VMCU) via SIPPR and BWM if configured (7,8,9) – VMCU accepts call (200 OK), responding via port 15000
Confirmations (ACK) flow through the above paths, ultimately exchanging client AV details with the VMCU in the SIP SDPs
RTP AV flows between VMCU and clients (10)
Because BWM and VMGR each start new calls, there are 5 calls/sessions here, and even SIPPR cannot see the entire set: CM talks directly to VMGR, using call/session details that differ from those in the VMGR call to VMCU. Without BWM there would still be 3 calls, two of which SIPPR would not see: CM would send one call to VMGR, and VMGR would send a call with different call/session details directly to VMCU.
Note: The CM, VMGR and VMCU also communicate with each other to ready conference bridges for use by means of XML over HTTPS/HTTP on various ports such as 8443, 9443, 443 and 8080
Three “calls” without BWM, Five with it
SIPPR “blind” to CM <-> VMGR
VMGR always involved
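The call counts quoted for the two flows (two or four for client to client, three or five for a conference leg) follow from simple arithmetic: BWM, as a B2BUA, turns each leg it handles into two calls, while the CM to VMGR leg never passes through it. A small Python sketch of that arithmetic, using flow names invented here for illustration:

```python
def sip_call_count(flow: str, bwm: bool) -> int:
    """Count the separate SIP calls/sessions in one of the flows described above."""
    if flow == "client_to_client":
        # CM -> caller and CM -> called client
        splittable_legs, fixed_legs = 2, 0
    elif flow == "conference_leg":
        # CM -> client and VMGR -> VMCU can be split by BWM; CM -> VMGR is direct
        splittable_legs, fixed_legs = 2, 1
    else:
        raise ValueError(f"unknown flow: {flow}")
    # A B2BUA terminates the incoming call and starts a new one: 2 calls per leg.
    calls_per_leg = 2 if bwm else 1
    return splittable_legs * calls_per_leg + fixed_legs


if __name__ == "__main__":
    for flow in ("client_to_client", "conference_leg"):
        for bwm in (False, True):
            print(flow, "BWM" if bwm else "no BWM", sip_call_count(flow, bwm))
```

When reading traces, this is the number of distinct Call-IDs to expect for a single user-visible call.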
“When It’s Fixed, Make it Mobile!”
Mobile Client access is available for all ST Packages: Communicate, Conference and Complete
Traversing the public internet, DMZ, etc. securely adds significant complexity
External/Mobile Clients (Best Practice)
[Diagram: a Mobile Client (private intranet / WiFi / 4G etc.) crosses the Internet to HRPS, SIP EDGE and TURN in the DMZ using HTTPS, SIP, STUN/TURN and tunnelled RTP; ST Proxy, CS, CM, SIPPR, VMGR and VMCU are on the corporate intranet]
Mobile Clients use an HTTP Reverse Proxy Server (HRPS) to talk to the Sametime Proxy Server which translates from HTTPS to Virtual Places (VP), allowing Mobile Clients to access all of the services of Community Server
Mobile Clients rely on SIP EDGE server for SIP to reach SIPPR and TURN server for RTP to reach intranet - such as VMCU or other clients
An External ST Connect Client would use a Sametime Multiplexer (MUX) in the DMZ instead of HRPS and ST Proxy but would still use SIP EDGE and TURN servers
The Sametime Meetings zero-download browser client plugin for AV also uses the Sametime Proxy Server (and HRPS if external)
Client to External/Mobile Client Call
[Diagram: the Client to Client flow extended with SIP EDGE and TURN in the DMZ between the corporate intranet and the Internet (private intranet / WiFi / 4G etc.), with SIP, STUN/TURN and tunnelled RTP paths and numbered steps 1 to 12]
Flow is as Client to Client flow but SIP Edge server handles SIP to External/Mobile Client (9)
External/Mobile Client uses Interactive Connectivity Establishment (ICE) with STUN (Session Traversal Utilities for NAT) / TURN (Traversal Using Relays around NAT) server to determine all RTP candidates (10) before media flows - which in this case uses TURN server to relay the RTP (11,12)
Conference Leg with External/Mobile Client
[Diagram: the Conference Leg flow with VMGR and VMCU, extended with SIP EDGE and TURN in the DMZ; VP arrives from ST Proxy or MUX in the DMZ, and the External/Mobile Client's RTP is tunnelled through TURN; numbered steps 1 to 13]
Flow is as Conference Leg flow but SIP Edge server handles SIP to External/Mobile Client (6)
External/Mobile Client uses ICE with STUN / TURN server to help determine RTP candidates (7) before final negotiation of RTP stream - which in this case uses TURN server to relay the RTP (12,13)
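The RTP candidates that ICE gathers travel in the SDP as a=candidate lines, so a quick way to see what a client offered is to pull those lines out of a captured SIP message. The sketch below is illustrative only: the candidate lines use documentation addresses and made-up priorities, and only the fields needed here are parsed.

```python
def parse_ice_candidates(sdp: str) -> list:
    """Extract transport, address, port and type from a=candidate lines."""
    candidates = []
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith("a=candidate:"):
            continue
        # foundation component transport priority address port "typ" type ...
        fields = line[len("a=candidate:"):].split()
        candidates.append({
            "transport": fields[2].upper(),
            "address": fields[4],
            "port": int(fields[5]),
            "type": fields[7],  # host, srflx (learned via STUN) or relay (TURN)
        })
    return candidates


# Hypothetical candidates: local address, STUN-discovered address, TURN relay.
EXAMPLE_CANDIDATES = """a=candidate:1 1 UDP 2130706431 10.0.0.5 20830 typ host
a=candidate:2 1 UDP 1694498815 198.51.100.7 40000 typ srflx raddr 10.0.0.5 rport 20830
a=candidate:3 1 UDP 16777215 203.0.113.9 40002 typ relay raddr 198.51.100.7 rport 40000
"""

if __name__ == "__main__":
    for candidate in parse_ice_candidates(EXAMPLE_CANDIDATES):
        print(candidate)
```

If an external client's offer contains no relay candidate, that points at the TURN server being unreachable rather than at SIP routing.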
Considerations for Mobile/External Clients
Use Split Horizon DNS if services are available both internally and externally – inside and outside addresses for:
– SIP Proxy and Registrar / SIP EDGE
– TURN Server (0.0.0.0 internally)
– Sametime Proxy Server / HRPS
– Sametime Meeting Server / HRPS
– Community Server / Mux
Consistent domain name (eg, thinkrite.com) for LTPA tokens to work correctly
TLS Certificates from official Certificate Authority using this consistent domain name
For STUN/TURN no NAT can be configured and firewalls must be in transparent/bridging mode as Clients must be able to connect to TURN servers in DMZ directly for STUN (3478) to work
VMCU must be able to talk to TURN in the same way and send/receive RTP (20830+/40000+)
Troubleshooting Mobile/External Clients
488 Not Acceptable Here often indicates an unexpected failure to establish AV via ICE/STUN/TURN – check that TURN server (via its hostname on STUN port 3478) AND other Client is reachable (Firewalls / NAT / VPNs / routing / DNS may prevent it – this may not be immediately evident as both Clients may be able to chat through CS/MUX/Proxy, reach SIPPR/EDGE, etc.)
ICE time-out errors – AV may still be established – network/negotiations may be strangely slow – try changing RTO in Media Manager ICE properties in Sametime System Console to 500
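One way to check TURN server reachability on the STUN port from a client network is to send a bare STUN Binding Request and see whether a matching success response comes back. This is a minimal sketch, assuming the server answers RFC 5389 style Binding Requests over UDP without authentication; the function names are invented here and the hostname you pass in is your own.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101


def build_binding_request(transaction_id=None) -> bytes:
    """20-byte STUN header: type, length (no attributes), cookie, transaction ID."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def is_binding_success(response: bytes, request: bytes) -> bool:
    """True if the response is a Binding Success for this request's transaction."""
    if len(response) < 20:
        return False
    msg_type, _length, cookie = struct.unpack("!HHI", response[:8])
    return (msg_type == BINDING_SUCCESS and cookie == MAGIC_COOKIE
            and response[8:20] == request[8:20])


def stun_reachable(host: str, port: int = 3478, timeout: float = 2.0) -> bool:
    """Send one Binding Request over UDP; False on timeout or any socket error."""
    request = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(request, (host, port))
            response, _ = sock.recvfrom(2048)
        except OSError:
            return False
    return is_binding_success(response, request)
```

Run it from both the internal and the external side, using the hostname each side resolves, since Split Horizon DNS means the two tests exercise different addresses.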
Scaling Up SIPPR/CM: WAS-SIP Container-Based Servers
WebSphere Application Server can be clustered vertically (on same machine) or horizontally (on different machines) – in either case the active memory is shared
For SIP Applications the amount of communications to share active memory between physical machines is very high
WAS Clusters must be fronted by a WebSphere Proxy Server (simple to create using SSC/Deployment Manager) with the main IP address, this is a stateless SIP Proxy which load balances WAS instances and offloads the actual TCP/IP or TLS connections from them
[Diagram: a WS Proxy Server (WPS) accepts TCP/UDP/TLS connections, distributes load and maintains sessions across WAS1 and WAS2 in a WAS Cluster with a shared environment]
Gotchas for Scaling up Conference Manager / SIP Proxy & Registrar
Without Clustering both CM and SIPPR can be on same server, but with Clustering they must be in separate clusters
The limiting factor is the number of connections the WS Proxy can handle (the maximum is now set under SIP/SIPS_PROXY_CHAIN > inbound channel; it was 20,000 before)
OS capabilities may need to be tuned as may external factors such as LDAP
Installing multiple WAS instances on the same machine may result in port conflicts (can be resolved by manual editing or WAS 8.5.5.2)
Some manual editing of files outside of SSC configuration is required - clustered CMs each need a separate stavconfig.xml file with a different NotificationServerHost (CM’s own FQDN) / NotificationServerPort (normally 9443)
– http://www-01.ibm.com/support/docview.wss?uid=swg21663243
How One becomes Many – Scaling Up
[Diagram: a Standalone Media Manager (CM on :5080 and PR on :5060, which could also include SSC and DB2) compared with a Clustered Media Manager: a CM WS Proxy Server (PS1, :5080) in front of the CM WAS Cluster (CM1, CM2 on :508x/:508y) and a SIPPR WS Proxy Server (PS2, :5060) in front of the SIPPR WAS Cluster (PR1, PR2 on :506x/:506y); SSC includes the deployment manager for all CM, SIPPR, PS, etc.]
Gotchas for Scaling up ST 9 SIPPR
Single Handled Domain must be configured for ST9
– Use the same FQDN as the DNS for SIPPR, same domain as in your certificates
– Clients/trunks setting this domain is all important – all incoming calls/SIP is expected to feature this name in the Request URI/To headers for SIPPR to use rules to send calls to clients – all other SIP will just be forwarded according to Request URI (which could result in a loop and 483 Too Many Hops if that address comes back to SIPPR itself)
– For Sametime Voice/Phone/SUT-Lite Conference Manager constructs a MESSAGE for client notification based on the received INVITE, only sending it to the Proxy Registrar if the Request URI for a received call matches the SIP Proxy Registrar FQDN shown in stavconfig.xml
sippr.thinkrite.com
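The handled-domain behaviour and the 483 loop can be pictured with a toy model. This Python sketch is not SIPPR's implementation: it only assumes, as described above, that requests for the handled domain are matched against rules while everything else is forwarded by Request URI, and that each forward decrements Max-Forwards as SIP proxies do. The alias hostname is hypothetical and stands for a name that resolves back to SIPPR.

```python
def route(request_uri_host: str, handled_domain: str, max_forwards: int) -> str:
    """Decide what a proxy does with one request (simplified model)."""
    if max_forwards <= 0:
        return "483 Too Many Hops"
    if request_uri_host.lower() == handled_domain.lower():
        return "apply rules"
    return "forward to " + request_uri_host


def hops_until_decided(request_uri_host: str, handled_domain: str,
                       max_forwards: int = 70):
    """Follow a request whose Request URI host resolves back to this proxy."""
    hops = 0
    while True:
        decision = route(request_uri_host, handled_domain, max_forwards)
        if not decision.startswith("forward"):
            return hops, decision
        # Forwarding decrements Max-Forwards; the request then arrives here again.
        max_forwards -= 1
        hops += 1


if __name__ == "__main__":
    print(hops_until_decided("sippr.thinkrite.com", "sippr.thinkrite.com"))
    print(hops_until_decided("sippr-alias.example.com", "sippr.thinkrite.com"))
```

The second call loops until Max-Forwards is exhausted, which is the symptom to look for when a trunk or client uses a name other than the handled domain.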
Scaling Up VMGR and VMCU Servers
VMCUs run on Linux only and are not WebSphere/Java-based, they can be configured in resource pools for specific geographic areas or for other purposes
VMGRs while running with SIP in WebSphere (on Linux only) do not use the WebSphere SIP Container so cannot use the WebSphere Proxy – they include their own load balancer component running on ports 5080 and 7443 instead of 5060 and 8443
Solid database replicates information from Master (M) to Hot Standby (HS) and other Replicas (R)
[Diagram: VMGR Farm (VMGR1 to VMGR3, :5060/:8443) fronted by the VMGR load balancers (Master, Hot Standby and Replica, on :5080/:7443), which distribute load and maintain sessions; VMCU Farm (VMCU1 to VMCU3, :5060/:8080) split into VMCU pool 1 and VMCU pool 2]
“End to End” AV Scaling (without EDGE/TURN)
[Diagram: Clients and CS in front of the CM, SIPPR and BWM clusters (each with its own WS Proxy, WPS1 to WPS3, plus DB2), the VMGR Farm behind its VMGR load balancers and the VMCU Farm; paths shown for calls to clients, inbound calls (SUT-Lite) and conference calls]
“When it’s Resilient, Break It!”
(Take full backups and test restore procedure first!)
Perform Failover testing, initially “gently” but also try more severe tests
Have clients logged in and make calls at the time of the tests to see what happens
Redundancy for WAS-SIP Container-Based Servers like SIPPR and CM
For a single IP address to reach these clusters use simple Load Balancers/IP Sprayers such as WebSphere EDGE Components LB for IPv4/6 or F5 BIG-IP LTM
[Diagram: a Load Balancer pair (LB1, LB2) shares a Virtual IP Address (VIP) that can fail over between them and sprays IP traffic from that single address to the WS Proxy Servers (WPS1 to WPS3), which distribute load and maintain sessions over TCP/UDP/TLS to WAS1 to WAS3 in a WAS Cluster with a shared environment]
Load Balancers
One Load Balancer server can theoretically be used for all Sametime Servers (a redundant pair is obviously recommended!)
– Needs a FQDN and Virtual IP address for each Sametime Service (SIPPR, CM, VMGR, Proxy, Meetings, TURN) – plus its own physical address(es)
MAC Forwarding – fastest option (and LB out of IP connection) but must be on same VLAN
– Necessary to set up a loopback (extra, non-ARP, not in routing table) IP address on the WS Proxy etc. to receive packets from the Load Balancer
– LVS/IPVS uses same technique for “Direct Connection” (F5 calls this L2 nPath routing)
Other methods overcome VLAN/loopback limitations but are slower and interfere more
– Encapsulation/Tunnelling (F5 L3 nPath routing), NAT/SNAT (source address translation), etc.
– With SNAT must configure special settings in WS Proxy to rewrite packet details – IP address of Load Balancer to FQDN of service
In /etc/sysctl.conf or with sysctl -w: net.ipv4.conf.all.arp_ignore=3 net.ipv4.conf.all.arp_announce=2
ip addr add $CLUSTER_ADDRESS/32 scope host dev lo
Gotchas for Scaling up
The Load Balancer must be extremely simple for the WS Proxy / Application Server logic to work correctly
– Ideally just the Layer 2 (MAC) address details of the IP packet are changed to forward the packet, allowing the WS Proxy to take over negotiating the entire TCP/TLS session
– If the Load Balancer is to actually read and forward a new TCP/TLS packet no SIP details should be changed and no new headers should be added
– For F5 BIG-IP Local Traffic Manager (LTM) do not configure SIP / SIP Persistence / mblb profiles as these result in LTM acting like a SIP UA/Proxy and Via/Record-Route headers are added – this results in lost connections after around 5 minutes because:
- the WS Proxy detects this SIP UA is in front of the client and doesn’t add RFC 5626 flow tokens
- special TCP keep-alive messages on the SIP connection do not make it through the F5
WS Proxy Health Check Settings
An intelligent Load Balancer will only send packets to online WS Proxies – which they can determine from responses to SIP OPTIONS requests – the WS Proxy should respond immediately to such OPTIONS
(in comparison the WS Proxy uses Distribution and Consistency Services (DCS) rather than SIP to determine if its Application Servers – eg, SIPPR - are running)
If you need to configure more than two addresses it is possible to modify the comma separated LBIPAddr setting in the file proxy-settings.xml – but returning to the configuration page will remove all but the first two addresses
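To see what such a health check looks like on the wire, the following Python sketch builds a minimal SIP OPTIONS request and extracts the status code from a response. The header values (the healthcheck user, tags and Call-ID) are placeholders invented for illustration; a real load balancer will send its own variant.

```python
import uuid


def build_options(target_host: str, target_port: int,
                  local_host: str, local_port: int,
                  transport: str = "TCP") -> str:
    """Build a minimal SIP OPTIONS request with the mandatory headers."""
    branch = "z9hG4bK" + uuid.uuid4().hex  # RFC 3261 branch magic-cookie prefix
    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/{transport} {local_host}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:healthcheck@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_host}:{target_port}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines)


def status_code(response: str) -> int:
    """Return the numeric status from a SIP response's first line."""
    first_line = response.split("\r\n", 1)[0]
    version, code, *_ = first_line.split(" ", 2)
    if version != "SIP/2.0":
        raise ValueError("not a SIP response: " + first_line)
    return int(code)


if __name__ == "__main__":
    print(build_options("sippr.thinkrite.com", 5060, "monitor.example.com", 5070))
```

Sending the request over a TCP socket to each WS Proxy address and checking for any prompt response is enough to tell a live proxy from a dead one.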
WS Proxy IP Forwarding Load Balancer and other Custom Properties
contactRegistryEnabled false for faster shutdown
disableAllHostNameLookups should be set to true for performance, this does not affect the use of hostnames in the below IPSprayer settings…
tcp/tls/udp.IPSprayer.host is the hostname of the virtual IP of the load balancer – ie, for the SIPPR it is the hostname of the address to which clients expect to connect
ipForwardingLBEnabled true – replaces the host and port from LB with the IPSprayer.host/port details
isSipComplianceEnabled false to avoid logging interoperability events for TCP keep-alives, etc.
enableMultiClusterRouting true to allow (eg, keep-alive) packets with apparently invalid routing info to SIPPR
http://www-01.ibm.com/support/docview.wss?uid=swg21666746
WS Proxy Custom Property for Older Clients
Older clients (Including ST 8.5.2 embedded in Notes 9 – especially common on Linux where full AV/SUT is not yet available in ST 9) need special handling:
– Import WebSphereSIPProxy/ConnectionReuseFilter.jar from disk 1 of Media Manager as an Asset on the WebSphere Proxy Server
– Configure a Business Level Application (BLA) and BLA CU (Composition Unit) using this artefact
– Set forceRport=true custom property
http://www-01.ibm.com/support/knowledgecenter/SSKTXQ_9.0.0/admin/install/inst_config_clus_av_sippr_wasproxy_filter.dita
How One becomes Many – SIPPR/CM Redundancy
[Diagram: Load Balancers (LB1, LB2) with VIPs spray SIP to the CM WS Proxy Servers (CM PS1, CM PS2 on :5080) and the SIPPR WS Proxy Servers (PR PS1, PR PS2 on :5060), which front the CM WAS Cluster (CM1, CM2) and the SIPPR WAS Cluster (PR1, PR2); SSC includes the deployment manager for all CM, SIPPR, PS, etc., with DB2 HADR; a Standalone Media Manager (which could also include SSC and DB2) is shown for comparison with the Clustered Media Manager]
Redundancy for VMGR and VMCU Servers
VMCUs run on Linux only and are not WebSphere/Java-based, they can be configured in resource pools for redundancy
VMGRs while running with SIP in WebSphere (on Linux only) do not use the WebSphere SIP Container or WebSphere Proxy – they include their own load balancers which are aware of where requests were previously sent and are being handled
For a single IP address to reach the VMGRs use an IP Sprayer which is SIP (5080/5081) and HTTP/HTTPS (7443) compliant (the same as for other Sametime servers is fine)
[Diagram: IP Sprayers (IS1, IS2) share a Virtual IP Address (VIP) that can fail over between them and spray traffic from that single address to the VMGR load balancers (Master, Hot Standby and Replica, on :5080/:7443), which distribute load and maintain sessions across the VMGR Farm (VMGR1 to VMGR3, :5060/:8443); the VMCU Farm (VMCU1 to VMCU3, :5060/:8080) is split into VMCU pool 1 and VMCU pool 2]
“End to End” AV Redundancy (without EDGE/TURN)
[Diagram: the end-to-end scaled environment with redundant Load Balancer pairs (LB1, LB2, VIP) added in front of the WS Proxys of the CM, SIPPR and BWM clusters (WPS1 to WPS6) and in front of the VMGR load balancers, plus DB2 HADR; paths shown for client calls to clients, inbound calls (SUT-Lite) and conference calls]
How Highly Available is a Clustered Sametime AV Environment?
Failover of a MAC-Forwarding Load Balancer should not affect calls
– Load Balancer is not involved in the actual connection, only new incoming connections
– Connection information can also be replicated from one Load Balancer to its partner(s)
Loss of a WebSphere Application Server should not affect calls – shared environment
– However some SIP being processed by that Application Server could be lost, disrupting call set-up, tear-down or continuation of a very small number of calls
Loss of a WS Proxy will result in calls being lost
– Unless you use UDP (which cannot normally cope with the size of packets which include all the Sametime Codecs) the TCP/TLS connection from the client was established to a specific WS Proxy so if that goes down its connections are dropped
– Each connection is a client’s ability to make/receive/continue calls so any calls are lost and the clients will have to re-REGISTER when they detect the failure (within 1 minute, configurable)
– WS Proxies can be clustered but this does not provide High Availability / Connection information being shared or any method to maintain TCP/TLS connection
High Availability Comparison – Sametime Unified Telephony
[Diagram: Client and CS reach the SIPPR Cluster (PR1, PR2) through Load Balancers (LB3, LB4, VIP) and WS Proxys (WPS3, WPS4); behind it sit the SUT Telephony Control Server and Telephony Application Server clusters, each with VIPs, connecting to the IP PBXes]
Active/Active Telephony Control Server (TCS) Cluster 99.999% available
Hot/Hot Solid DB replication
Hot/Hot Universal Call Engine (UCE) with shared call context memory
SIP Service Manager (SIPSM) and Computer Supported Telecoms Apps Service Manager (CSTASM) can failover
Telephony Application Server Cluster with Hot Standby:
– Framework (FW) and Media Server (MS) on one SAN partition and WebSphere Application Server (WAS) on another
– System Automation for MultiPlatforms (SAMP) and Reliable Scalable Cluster Technology (RSCT) manages failover to spare node
Softphone calls still go through SIPPR Cluster
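For a sense of scale, the 99.999% figure quoted for the TCS cluster can be converted into permitted downtime per year. The sketch below is plain arithmetic, assuming a 365-day year; it says nothing about the availability of the other components discussed.

```python
def downtime_minutes_per_year(availability_percent: float,
                              days_per_year: int = 365) -> float:
    """Minutes of downtime per year implied by an availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * days_per_year * 24 * 60


if __name__ == "__main__":
    for percent in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(percent)
        print(f"{percent}% -> {minutes:.1f} minutes/year")
```

Five nines works out to a little over five minutes a year, which is a useful yardstick when deciding whether losing the connections on one WS Proxy is acceptable for your users.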
Comparing Other types of High Availability and Scalability
High Availability Disaster Recovery (HADR) replication for DB2 server pair with SAMP/RSCT handling failover – no Virtual IP Addresses/Aliases – DB2 clients aware of both servers
VMware High Availability – much like SUT TAS but fails over the entire virtual machine
VMware Fault Tolerance – much like SUT TCS, second virtual machine in vLockStep becomes active upon failure of first - but can only use one vCPU until SMP-FT in ESXi 6.0
Scalability and Redundancy for Other Sametime servers
SIP EDGE Servers scale up in the same way as SIPPR and CM using WS Proxy and LBs
Sametime Meeting Servers scale up in the same way using WS Proxy for HTTP and LBs
Bandwidth Manager can scale up in the same way but only with two nodes
– uses WAS7 so needs its own Deployment Manager to configure the cluster
Sametime Proxy Servers do not need WS Proxy Servers (just Load Balancers)
TURN Servers can be fronted by IP or MAC Forwarding Load Balancers – http://www-01.ibm.com/support/knowledgecenter/SSKTXQ_9.0.0/admin/install/inst_config_turn_properties.dita
– Remember that no NAT can be configured and firewalls must be in transparent/bridging mode, Clients must (appear to) be able to connect to TURN servers in DMZ directly
“When it’s Secure, Hack It!”
HTTPS / TLS / SRTP should be configured
– Force web traffic to SSL/TLS using boundary devices/firewalls
– Test media with TCP/RTP first and then switch to TLS/SRTP and re-test
– 3rd party devices may need certificates exchanged for TLS to work
– If need be can have some (eg, intranet to VCS) connections using TCP and others using TLS
Certificates from official Certificate Authority should be used on internet side
Discover what it takes to decode TLS using Wireshark
Discover what it could take to commit fraud or a DoS attack
Appreciate why you need to keep certificates and their (non-default!) passwords safe
Tighten security as a result of any findings and re-test to check nothing is broken
openssl pkcs12 -in key.p12 -nocerts -nodes -out decryptkey.pem
SSO and Securing anonymous access
Edit stavconfig.xml changing SIPAuthenticationType to LTPA if you have configured SSO
Enable anonymous access by token authentication on CS to avoid DoS attacks
http://www-01.ibm.com/support/knowledgecenter/SSKTXQ_9.0.0/admin/config/st_adm_security_allow_token_auth_enable.dita
Ensure there is an anonymous user in LDAP
Put the shared key txt files in a directory which can be found – with appropriate permissions - on both SIPPR and CM (not in regular WAS profile directories which are unique per system) and set shared secret key paths in WAS Trust Association Interceptors, restart SIPPR and CM and check stavconfig.xml has these paths
Set TURNTokenAuthEnabled=true if clients are all ST 9.0 (TURN authentication not supported by previous clients)
For TURN server put file from SecretKeyPathForTurnAuthToken and key txt files in root directory / and put filenames in TurnServer.properties
Heads Up on Common Issues – IP Telephony
Restrictions on packet size (eg, UDP / SIP-aware firewalls) causes issues with the long list of codecs, ICE/STUN/TURN candidates and encryption options in SIP from Sametime Clients and VMCU – IP telephony may not have hit this issue in the same environment
G.729 is not currently supported except with SUT, iLBC is not yet supported in ST9, calls over the WAN may be prevented from using G.711 – discuss your needs and options with IBM
Lossy codecs especially in combination or used twice (eg, on an external conference bridge) may produce poor voice quality – ensure such use cases are evaluated
SIP session timers may provoke issues – set these low for testing and high for production
Test on/off hold, transfers, any conferencing and other special features like bridging, TLS...
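When discussing whether G.711 is viable over a WAN link, it helps to estimate per-call bandwidth including packet overhead. The sketch below assumes 20 ms packetization and 40 bytes of IPv4 + UDP + RTP headers per packet (20 + 8 + 12); it ignores layer 2 framing and any SRTP authentication tag, so treat the results as lower bounds rather than measured Sametime figures.

```python
def voice_bandwidth_kbps(codec_kbps: float, packet_ms: float = 20,
                         overhead_bytes: int = 40) -> float:
    """One-way IP bandwidth for a voice stream, in kbit/s."""
    # Payload carried in each packet, e.g. 64 kbit/s * 20 ms = 160 bytes
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + overhead_bytes) * 8 * packets_per_second / 1000


if __name__ == "__main__":
    print("G.711:", voice_bandwidth_kbps(64), "kbit/s")
    print("G.729:", voice_bandwidth_kbps(8), "kbit/s")
```

Under these assumptions G.711 needs 80 kbit/s per direction against 24 kbit/s for G.729, which is why WAN policies often steer calls away from G.711.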
Heads Up on Common Issues – WiFi and Firewalls
Corporate WiFi is a completely different environment to Mobile Data – test both!
Corporate Guest WiFi is another different environment - ensure the expectation and/or testing receives focus early on, as changes in this environment are a sensitive area
WiFi in other environments (some airports, hotels, etc.) may also be too restrictive
– Using Mobile Data instead by switching off WiFi on phone would be expected to work
Move non-standard ports to 80 and 443 where possible to overcome firewall issues, or specifically ask for firewalls to be opened for SIP (5060) and TURN (3478) and RTP
Lessons Learned in Hosting
VMCU really needs dedicated hardware meeting minimum spec (4 core, 8GB RAM) which is best placed on-premises in customer data center to keep latency to a minimum
VMCU requires eth0 to be used for its connection to VMGR - this is not generally possible to create without access to a Bare Metal Server (BMS)
Once you have one BMS get a second for redundancy and/or high speed (consistent performance guaranteed iops) shared storage between them for clones
Reserving CPU, memory and bandwidth as documented are all important in high-performance enterprise environments (much less so in small evaluations but reserve now or suffer later)
BMS reboots can cause datastore corruption – resist the urge to exploit simplistic automated monitoring which can result in this!
Why I UNIX/Linux/AIX/…
Pick an OS which gives you fast, secure access to the command line and the ability to troubleshoot the entire foundation of the system from that command line including the boot process and background processes
Standardize on one OS … logically RHEL or SLES by virtue of VMGR / VMCU
Make exceptions where necessary (eg, Document Conversion, ability to restart services without restarting entire Community Server)
OS Tips
Use bonding to both protect against physical adapter failure and simplify virtual machine cloning
Reduce TCP keepalive time to prevent backed-up queues – net.ipv4.tcp_keepalive_time=60
Reduce TCP final timeout to allow connections to end faster – net.ipv4.tcp_fin_timeout=30
Check and increase default system limits – ulimit / limits.conf / sysctl.conf
Use LVM with ext3/ext4 and leave space for snapshots
Install wireshark before you need it
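The two TCP timeout tips can be checked across many servers with a few lines of script. This Python sketch parses sysctl.conf style text and reports keys that are missing or higher than the values suggested above; the function names are invented here, and net.ipv4.tcp_fin_timeout is the full Linux key name for the final timeout.

```python
# Values taken from the tips above.
RECOMMENDED_MAX = {
    "net.ipv4.tcp_keepalive_time": 60,
    "net.ipv4.tcp_fin_timeout": 30,
}


def parse_sysctl(text: str) -> dict:
    """Parse key = value lines, ignoring blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        settings[key] = value
    return settings


def settings_to_review(text: str) -> dict:
    """Return keys that are unset (None) or above the suggested value."""
    current = parse_sysctl(text)
    findings = {}
    for key, limit in RECOMMENDED_MAX.items():
        value = current.get(key)
        if value is None or int(value) > limit:
            findings[key] = value
    return findings


if __name__ == "__main__":
    sample = "net.ipv4.tcp_keepalive_time = 7200\nnet.ipv4.tcp_fin_timeout=30\n"
    print(settings_to_review(sample))
```

Feed it the contents of /etc/sysctl.conf, or the output of sysctl -a to check the running values rather than the configured ones.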
Draw a diagram
Draw (or purloin) some deployment diagrams to share with IBM support – they will ask you for them
– at a minimum include all Sametime components, proxies, load balancers
– if possible include additional detail on firewalls, VPNs, etc.
Phones Outside the Box
The call flows we showed included only Clients – but the scenarios can also involve Phones
Sametime Meetings can also call out to Phones with simple SIPPR rules
– Condition: Method=INVITE RequestURI=sip:[0-9]{6}@.*
– Destination: Request-URI pattern=sip:(.+)@.* Output pattern=sip:[email protected]:5060;transport=tcp
– Also set TelephoneConferenceEnabled=true in ConferenceManager.properties in /opt/IBM/WebSphere/profiles/*/installedApps/*/ConferenceFocus.ear/ConferenceFocus.war
Phones can also call into conference calls set up by Sametime Meetings
– Condition: Method=INVITE RequestURI=sip:[0-9]{4}@.* Source Address=ippbx.x.y.com
– Destination: sip:stvmgr.x.y.com:5060;transport=tcp
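The rule conditions above are regular expressions, so they can be tried out against sample Request URIs before being entered in the SIPPR configuration. This Python sketch assumes the whole Request URI must match the pattern (how SIPPR anchors its patterns is not stated here) and uses made-up URIs; it shows only the condition match and the captured user part, not the output pattern.

```python
import re

# Patterns copied from the rules above.
DIAL_OUT_CONDITION = re.compile(r"sip:[0-9]{6}@.*")   # six-digit numbers to phones
DIAL_IN_CONDITION = re.compile(r"sip:[0-9]{4}@.*")    # four-digit conference numbers
REQUEST_URI_PATTERN = re.compile(r"sip:(.+)@.*")      # group 1 is the user part


def matches(condition, request_uri: str) -> bool:
    """True if the whole Request URI matches the rule condition."""
    return condition.fullmatch(request_uri) is not None


def user_part(request_uri: str):
    """Return the text captured by the Request-URI pattern, or None."""
    match = REQUEST_URI_PATTERN.fullmatch(request_uri)
    return match.group(1) if match else None


if __name__ == "__main__":
    uri = "sip:123456@x.y.com"
    print(matches(DIAL_OUT_CONDITION, uri), user_part(uri))
```

Testing the boundary cases (five, six and seven digits) catches dialplan overlaps before they show up as misrouted calls.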
ST telephone numbers
ST will normally REGISTER using what is in the telephoneNumber field from LDAP
In fact ST really uses whatever is in Person document cache – which is taken from the Business Card
Business Card can be changed in SSC or by editing XML but the Telephone Number field should normally show the PSTN number
If there is no (valid) telephoneNumber then some outbound calls may work using the e-mail address registration for P2P calls – but a valid telephoneNumber is required for reliable ST telephone number and/or sip dialling
Obviously the numbers in LDAP for ST must be unique!
How can you call ST from a phone?
Assuming a user has a real phone and its number is in telephoneNumber then calls to telephoneNumber would go to the real phone
An internal dialling code could be used to reach the softphone instead if IP PBXes can transform the dialled number (SIPPR configuration cannot transform a received number to another number – the “To:” header cannot be manipulated – what is received in INVITE must match what is REGISTERed)
An external dialling convention is not possible but a call-forward on the real phone could reach the softphone
SUT has superior support both for allowing a user to select their preferred device to receive a call on and integration without call-forwarding, called and calling number translation, etc.
ST Plugin allows a field other than telephoneNumber for softphone
Custom Plugin allows use of other Business Card fields or other LDAP fields
Often users are allowed to edit their Telephone Number
– by using another field issues with user-edited numbers can be eliminated
Different capabilities are available with different vendors – external TCSPI integration allows CS / CM to start conferences and provide moderator controls
Some integration can be achieved through SIP addressing (ST )
– (Outbound) Condition: Method=INVITE Request URI=.*@x\.y\.com.*
– Destination: Request URI pattern=sip:(.+)@.* Output pattern=sip:[email protected]:5060;transport=tcp
– (Inbound) Condition: Method=INVITE Source Address=dma.x.y.com
– Destination: sip:stcm.x.y.com:5060;transport=tcp (Push Route)
It is also possible to integrate with 3rd party video clients through such a bridge
– (Outbound) Condition: Method=INVITE Request URI=.*\.3pvc.*
– Destination: Request URI pattern=sip:(.+)@.* Output pattern=sip:[email protected]:5060;transport=tcp
3rd Party Video Conferencing
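The outbound rules above capture the user part of the Request URI and re-address the call to the bridge. The sketch below models that with Python regular expressions. The bridge host is a placeholder (bridge.example.invalid), not the real destination, and the \1 template is Python's notation rather than the proxy's own output-pattern syntax; whole-URI matching is also an assumption.

```python
import re

CONDITION = re.compile(r".*@x\.y\.com.*")      # outbound condition from the slide
URI_PATTERN = re.compile(r"sip:(.+)@.*")        # Request URI pattern from the slide
# Placeholder host and Python template syntax, both assumptions for illustration
OUTPUT_TEMPLATE = r"sip:\1@bridge.example.invalid:5060;transport=tcp"


def rewrite_outbound(method, request_uri):
    """Return the re-addressed URI for a matching INVITE, else None."""
    if method != "INVITE" or not CONDITION.fullmatch(request_uri):
        return None
    match = URI_PATTERN.fullmatch(request_uri)
    if match is None:
        return None
    # expand() substitutes the captured user part into the template
    return match.expand(OUTPUT_TEMPLATE)


print(rewrite_outbound("INVITE", "sip:room42@x.y.com"))
print(rewrite_outbound("INVITE", "sip:room42@elsewhere.example"))
```

Only the user part survives the rewrite; the original domain is replaced by the bridge address.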
Monitoring Inside and Outside the Box
ThinkRite managed services hinge on pro-active monitoring scripts and a server dashboard (not a product) which run 24/7 and notify staff of potential issues. Many scripts run checks on the servers, but dedicated SIP and VP (watchit) bots run on the intranet and internet and can send independent alerts, as can the dashboard itself if updates dry up
There are many interfaces which may be useful for monitoring; identify which ones you can use
– the Bandwidth Manager ISC and STDBBWM database are particularly useful for monitoring calls
db2 -x "select cast(fromuserid as varchar(50)),cast(touserid as varchar(50)),endtime,endreason from bwm_media_sessions where starttime > (current_timestamp - 1 day)"
– also the Conference Manager logs callsummary.log.0 and conference.log.0 in …WebSphere/AppServer/profiles/*/logs/STMediaServer
– (REST) APIs on Conference Manager, Video Manager, … (see Links)
The greatest assurance, of course, remains Connect Client tests, for which watchit is invaluable
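Output from the db2 query above can be summarised by a monitoring script, for example by counting calls per end reason. A minimal sketch, assuming each row is whitespace-separated with endreason as a single last token; the sample rows, user ids and reason values are invented, so check them against real output before relying on this.

```python
from collections import Counter


def tally_end_reasons(db2_output):
    """Count rows per end reason (last column) in 'db2 -x' style output."""
    counts = Counter()
    for line in db2_output.splitlines():
        fields = line.split()
        # Expect fromuserid, touserid, endtime, endreason; skip anything shorter
        if len(fields) >= 4:
            counts[fields[-1]] += 1
    return counts


# Hypothetical rows; real ids, timestamps and reason values will differ
sample = """\
alice@x.y.com  bob@x.y.com    2015-01-26-10.15.30.000000  REASON_A
carol@x.y.com  alice@x.y.com  2015-01-26-11.02.10.000000  REASON_B
bob@x.y.com    carol@x.y.com  2015-01-26-11.40.55.000000  REASON_A
"""
for reason, count in tally_end_reasons(sample).most_common():
    print(reason, count)
```

A sudden rise in one end reason compared with the previous day is the kind of signal worth alerting on.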
Useful Links
http://www.slideshare.net/a8us/utf-8enibm-sametime-9-voice-and-video-deployment
http://www-01.ibm.com/support/docview.wss?uid=swg27040186&aid=1
http://www-10.lotus.com/ldd/stwiki.nsf/xpViewCategories.xsp?lookupName=Voice%20and%20Video
– BWM deployment best practices, CM and VMGR REST APIs, new tricks for ST AV…
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/WebSphere+SIP+and+CEA/page/Configuring+and+Deploying+WebSphere+SIP+Environments
http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/tsip_tunelinux.html
Links not relevant to Sametime AV
https://www.ibm.com/developerworks/community/wikis/home?lang=en#/wiki/WebSphere%20SIP%20and%20CEA/page/Achieving%20High%20Availability%20with%20WebSphere%20Application%20Server%20SIP%20Container%20and%20F5%20BIG-IP%20Local%20Traffic%20Manager (does not apply to Sametime!)
http://www.f5.com/pdf/deployment-guides/ibm-sametime-dg.pdf (does not include SIP!)
Related Sessions
Mon 1:00pm Mockingbird 1 & 2 MAS204 IBM Sametime Deployment Do’s and Don’ts: Tips, Tricks, Perils and Pitfalls
Tues 1:00pm Swan SW 1-2 BP103 Solving the Weird, Obscure and The Mind-Bending
Tues 3:45pm Dolphin S Hem 1 ID102 IBM Sametime: Design and Implementation of a full HADR Deployment
Weds 10:30am Mockingbird 1&2 ID109 Digital Nightmares – The Biggest Performance Killers in Your Environment
Weds 11:45am Swan SW 7-10 ID112 Connect the Dots: IBM Sametime Audio/Video Planning, Deployment, Troubleshooting and Beyond
Weds 1:30pm Dolphin S Hem 1 ID108 Mobile Security Roundup
Who Was That Man?
Jeremy Sanders, MSc (Proj Mgmt), is the Chief Technical Officer of ThinkRite Ltd (UK/EMEA) and continues to work with the ThinkRite team to integrate and develop enhancements for IBM SUT, Sametime Voice/Softphone (“SUT-Lite”) and IBM Unified Messaging. He has been involved with IBM in the development, integration, support and administration of what we now call Unified Communications for over 20 years.
For further details see the first few slides of last year’s presentation…
http://www.slideshare.net/kbmsg/jmp206
ThinkRite Ltd is the European division of ThinkRite Inc/ThinkRite Pty
ThinkRite provides Sametime/SUT installation services, managed services, hosting services, development services and innovative products including ThinkRite Assistant – Single Click to connect to all voice and web meetings using Sametime softphone and Mobile clients
http://www.thinkrite.com/brochures/ThinkRite%20Assistant%20Brochure.pdf
Think What?
One unique system for internal and external
Secured VPN to connect to Directory and PBX if needed
Available anywhere and on mobile devices without VPN access
Cloud 9.0
Engage Online
SocialBiz User Group socialbizug.org
– Join the epicenter of Notes and Collaboration user groups
Social Business Insights blog ibm.com/blogs/socialbusiness
– Read and engage with our bloggers
Follow us on Twitter
– @IBMConnect and @IBMSocialBiz
LinkedIn http://bit.ly/SBComm
– Participate in the IBM Social Business group on LinkedIn
Facebook https://www.facebook.com/IBMConnected
– Like IBM Social Business on Facebook